TITLE OF THE INVENTION

DNA MOLECULES ENCODING MACACA MULATTA ANDROGEN 
RECEPTOR 

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority of U.S. provisional application 
Serial No. 60/289,573, filed May 8, 2001. 

FIELD OF THE INVENTION 

The present invention relates in part to isolated nucleic acid molecules
(polynucleotides) which encode a Macaca mulatta (rhesus monkey) androgen
receptor (rhAR) protein. The present invention also relates to recombinant vectors
and recombinant hosts which contain a DNA fragment encoding rhAR, substantially
purified, biologically active forms of rhAR, including precursor and mature forms of
the protein, mutant proteins which retain a biological activity of interest, methods
associated with identifying compounds which modulate rhAR activity, and
non-human animals which have been subject to intervention to effect rhAR activity.

BACKGROUND OF THE INVENTION 
The members of the nuclear receptor superfamily, which includes the steroid
hormone receptors, are small chemical ligand-inducible transcription factors which
have been shown to play roles in controlling development, differentiation and
physiological function. Isolation of cDNA clones encoding nuclear receptors reveals
several characteristics. First, the NH2-terminal regions, or A/B domains, which vary
in length between receptors, are hypervariable with low homology between family
members. There are three internal regions of conservation, referred to as domains C,
D and E/F. Region C encodes a cysteine-rich region which is referred to as the DNA
binding domain (DBD). Regions D and E/F are within the COOH-terminal section of
the protein. Region D encodes the hinge domain, and region E/F encodes the
ligand binding domain (LBD). For a review, see Power et al. (1992, Trends in
Pharmaceutical Sciences 13: 318-323).

The lipophilic hormones that activate steroid receptors are known to be
associated with human diseases. Therefore, the respective nuclear receptors have
been identified as possible targets for therapeutic intervention. For a review of the
mechanism of action of various steroid hormone receptors, see Tsai and O'Malley
(1994, Annu. Rev. Biochem. 63: 451-486).

Recent work with non-steroid nuclear receptors has also shown their
potential as drug targets for therapeutic intervention. This work reports that
peroxisome proliferator activated receptor gamma (PPARγ), identified by a conserved
DBD region, promotes adipocyte differentiation upon activation and that
thiazolidinediones, a class of antidiabetic drugs, function through PPARγ (Tontonoz
et al., 1994, Cell 79: 1147-1156; Lehmann et al., 1995, J. Biol. Chem. 270(22):
12953-12956; Teboul et al., 1995, J. Biol. Chem. 270(47): 28183-28187). This
indicates that PPARγ plays a role in glucose homeostasis and lipid metabolism.

Mangelsdorf et al. (1995, Cell 83: 835-839) provide a review of
known members of the nuclear receptor superfamily.

U.S. Patent No. 5,614,620, issued to Liao and Chang on March 25,
1997, discloses nucleotide sequences encoding human and rat androgen receptor,
along with the complete amino acid sequence within the open reading frame of the
respective androgen receptor.

EP 0 365 657 B1, issued to French et al. on August 4, 1999, discloses a
recombinant DNA molecule encoding a human androgen receptor, along with the
amino acid sequences of human androgen receptor protein.

Choong et al. (1998, J. Mol. Evol. 47: 334-342) disclose amino acid
sequences for non-human primates such as chimpanzee, baboon, lemur and Macaca
fascicularis (see SEQ ID NO:6 for nucleotide sequence, see also GenBank Accession
No. U94179 for the nucleotide and amino acid sequence of Macaca fascicularis
androgen receptor).

Abdelgadir et al. (1999, Biology of Reproduction 60: 1251-1256)
disclose a PCR fragment representing a 5' portion of the Macaca mulatta coding
region (see also GenBank Accession No. AF092930).

It would be advantageous to identify additional genes closely related to
the human androgen receptor gene, such as those possessed by nonhuman primates
used for pharmacological investigation, which encode an androgen receptor protein.
Since the androgen receptor plays an important role in regulating development,
reproduction, and maintenance of bone and muscle, such genes, and their expressed
functional proteins, will be useful in assays to select for compounds which modulate
the biological activity of the androgen receptor, especially as this modulation pertains
to bone formation. The present invention addresses and meets these needs by
disclosing isolated nucleic acid molecules which encode a full-length Macaca mulatta
androgen receptor.

SUMMARY OF THE INVENTION

The present invention relates in part to isolated nucleic acid molecules
(polynucleotides) which encode a full-length Macaca mulatta androgen receptor
(rhAR), and the use of the expressed rhAR or portion thereof in the identification of
androgen selective compounds active in bone formation. The isolated
polynucleotides of the present invention encode a non-human primate member of this
nuclear receptor superfamily. The DNA molecules disclosed herein may be
transfected into a host cell of choice wherein the recombinant host cell provides a
source for substantial levels of an expressed functional rhAR. Such a functional
nuclear receptor will provide for an effective target for use in screening methodology
to identify modulators of the androgen receptor, modulators which may be effective
in regulating development, reproduction and maintenance of bone and muscle.

A preferred embodiment of the present invention is disclosed in Figure
1A-C and SEQ ID NO:1, an isolated DNA molecule encoding rhAR. Nucleotide
1051 is polymorphic, present as either an 'A' nucleotide or a 'G' nucleotide (see SEQ
ID NO:3).

To this end, another preferred embodiment of the present invention is
an isolated DNA molecule as shown in Figure 1A-C and SEQ ID NO:1, except
nucleotide 1051 is a 'G' nucleotide instead of an 'A' nucleotide; this isolated DNA
molecule being additionally disclosed as SEQ ID NO:3.

The present invention also relates to isolated nucleic acid fragments
which encode mRNA expressing a biologically active rhesus monkey androgen
receptor which belongs to the nuclear receptor superfamily. A preferred embodiment
relates to isolated nucleic acid fragments of SEQ ID NOs:1 and 3 which encode
mRNA expressing a biologically functional derivative of rhAR, especially such
nucleic acid fragments which encode all or a portion of the LBD and/or DBD regions
of the rhAR open reading frame.

The present invention also relates to recombinant vectors and
recombinant hosts, both prokaryotic and eukaryotic, transfected and/or transformed to
contain the substantially purified nucleic acid molecules disclosed throughout this
specification.

A preferred aspect of the present invention relates to a substantially
purified form of the novel nuclear trans-acting receptor protein, a rhesus androgen
receptor protein, which is disclosed in Figure 2 (SEQ ID NO:2), as well as allelic
variants of the protein disclosed in SEQ ID NO:2. One allelic variant is disclosed
herein as SEQ ID NO:4. The Glu-210 residue of rhAR of SEQ ID NO:2 represents the
parental allele. A single nucleotide change at nucleotide 1051 from 'A' (of SEQ ID
NO:1) to 'G' (of SEQ ID NO:3) results in an amino acid change at residue 210 of the
rhAR, from the Glu residue of SEQ ID NO:2 to a Gly-210 residue as disclosed in
SEQ ID NO:4 as the allelic variant.
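
The link between the nucleotide 1051 polymorphism and the residue 210 change can be checked with simple codon arithmetic. The following is a minimal sketch only, not part of the disclosed methods. It assumes that the open reading frame begins at the ATG at nucleotide 423 of SEQ ID NO:1 (an assumption inferred here from the sequence numbering, under which residue 210 spans nucleotides 1050-1052) and that nucleotides 1050 and 1052 are both 'G', as read from the sequence listing. The function and constant names are hypothetical.

```python
# Only the two codons relevant to the polymorphism are tabulated here.
CODON_TABLE = {"GAG": "Glu", "GGG": "Gly"}

# Assumed 1-based position of the 'A' of the initiating ATG in SEQ ID NO:1.
ORF_START = 423


def codon_for_nucleotide(position, orf_start=ORF_START):
    """Map a 1-based nucleotide position to (residue number, offset in codon)."""
    index = position - orf_start
    if index < 0:
        raise ValueError("position lies upstream of the assumed ORF start")
    return index // 3 + 1, index % 3


def variant_residue(allele, upstream="G", downstream="G"):
    """Translate the codon at nucleotides 1050-1052 for a given 1051 allele."""
    return CODON_TABLE[upstream + allele + downstream]


if __name__ == "__main__":
    print(codon_for_nucleotide(1051))                   # (210, 1)
    print(variant_residue("A"), variant_residue("G"))   # Glu Gly
```

Under these assumptions, nucleotide 1051 is the middle base of codon 210, so the 'A' allele reads GAG (Glu) and the 'G' allele reads GGG (Gly), in agreement with SEQ ID NOs:2 and 4.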

Another preferred aspect of the present invention relates to a
substantially purified, fully processed (including any proteolytic processing,
glycosylation and/or phosphorylation) mature rhAR protein obtained from a
recombinant host cell containing a DNA expression vector comprising a nucleotide
sequence as set forth in SEQ ID NOs:1 and 3, or nucleic acid fragments thereof as
described above, such DNA expression vectors expressing the respective rhAR
protein or rhAR precursor protein. It is especially preferred that the recombinant host
cell be a eukaryotic host cell, including but not limited to a mammalian cell line,
insect cell line, or yeast.

The present invention also relates to biologically functional derivatives
of rhAR as set forth as SEQ ID NOs:2 and 4, including but not limited to rhAR
mutants and biologically active fragments such as amino acid substitutions, deletions,
additions, amino-terminal truncations and carboxy-terminal truncations, such that
these fragments provide for proteins or protein fragments of diagnostic, therapeutic or
prophylactic use and would be useful for screening for agonists and/or antagonists of
rhAR function.

The present invention also relates to a non-human transgenic animal
which is useful for studying the ability of a variety of compounds to act as modulators
of rhAR, or any alternative functional rhAR, in vivo or, by providing cells for culture,
in vitro. In reference to the transgenic animals of this invention, reference is made to
transgenes and genes. As used herein, a transgene is a genetic construct including a
gene. The transgene is integrated into one or more chromosomes in the cells in an
animal by methods known in the art. Once integrated, the transgene is carried in at
least one place in the chromosomes of a transgenic animal. Of course, a gene is a
nucleotide sequence that encodes a protein, such as one or a combination of the
cDNA clones described herein. The gene and/or transgene may also include genetic
regulatory elements and/or structural elements known in the art. A type of target cell
for transgene introduction is the embryonic stem cell (ES). ES cells can be obtained
from pre-implantation embryos cultured in vitro and fused with embryos (Evans et
al., 1981, Nature 292:154-156; Bradley et al., 1984, Nature 309:255-258; Gossler et
al., 1986, Proc. Natl. Acad. Sci. USA 83:9065-9069; and Robertson et al., 1986,
Nature 322:445-448). Transgenes can be efficiently introduced into the ES cells by a
variety of standard techniques such as DNA transfection, microinjection, or by
retrovirus-mediated transduction. The resultant transformed ES cells can thereafter
be combined with blastocysts from a non-human animal. The introduced ES cells
thereafter colonize the embryo and contribute to the germ line of the resulting
chimeric animal (Jaenisch, 1988, Science 240: 1468-1474). It will also be within the
purview of the skilled artisan to produce transgenic or knock-out invertebrate animals
(e.g., C. elegans) which express the rhAR transgene in a wild type background as
well as in C. elegans mutants knocked out for one or both of the rhAR subunits. These
organisms will be helpful in further determining the dominant negative effect of rhAR
as well as selecting from compounds which modulate this effect.

WO 02/090529
PCT/US02/14175

The present invention also relates to a non-human transgenic animal
which is heterozygous for a functional rhAR gene native to that animal. As used
herein, functional is used to describe a gene or protein that, when present in a cell or
in vitro system, performs normally as if in a native or unaltered condition or
environment. The animal of this aspect of the invention is useful for the study of the
specific expression or activity of rhAR in an animal having only one functional copy
of the gene. The animal is also useful for studying the ability of a variety of
compounds to act as modulators of rhAR activity or expression in vivo or, by
providing cells for culture, in vitro. It is reiterated that as used herein, a modulator is
a compound that causes a change in the expression or activity of rhAR, or causes a
change in the effect of the interaction of rhAR with its ligand(s), or other protein(s).
In an embodiment of this aspect, the animal is used in a method for the preparation of
a further animal which lacks a functional native AR gene. In another embodiment,
the animal of this aspect is used in a method to prepare an animal which expresses the
non-native rhAR gene in the absence of the expression of a native AR gene. In
particular embodiments the non-human animal is a mouse.

In reference to the transgenic animals of this invention, reference is
made to transgenes and genes. As used herein, a transgene is a genetic construct
including a gene. The transgene is integrated into one or more chromosomes in the
cells in an animal by methods known in the art. Once integrated, the transgene is
carried in at least one place in the chromosomes of a transgenic animal. Of course, a
gene is a nucleotide sequence that encodes a protein, such as rhAR. The gene and/or
transgene may also include genetic regulatory elements and/or structural elements
known in the art.

An aspect of this invention is a method of producing transgenic
animals having a transgene including the non-native rhAR gene on a native AR null
background. The method includes providing transgenic animals of this invention
whose cells are heterozygous for a native gene encoding a functional rhAR protein
and an altered native AR gene. These animals are crossed with transgenic animals of
this invention that are hemizygous for a transgene including a non-native rhAR gene
to obtain animals that are both heterozygous for an altered native AR gene and
hemizygous for a non-native rhAR gene. The latter animals are interbred to obtain
animals that are homozygous or hemizygous for the non-native rhAR and are
homozygous for the altered native AR gene. In particular embodiments, cell lines are
produced from any of the animals produced in the steps of the method.

The transgenic animals of this invention are also useful in studying the
tissue and temporal specific expression patterns of a non-native rhAR throughout the
animals. The animals are also useful in determining the ability of various forms of
wild-type and mutant alleles of a non-native rhAR to rescue the native AR null
deficiency. The animals are also useful for identifying and studying the ability of a
variety of compounds to act as modulators of the expression or activity of a
non-native rhAR in vivo, or by providing cells for culture, for in vitro studies.

Of particular interest are transgenic mice with rhAR where rhAR expression
dominates mouse endogenous AR and can be turned on in a tissue-specific manner.

As used herein, a "targeted gene" or "Knockout" (KO) is a DNA
sequence introduced into the germline of a non-human animal by way of human
intervention, including but not limited to, the methods described herein. The targeted
genes of the invention include nucleic acid sequences which are designed to
specifically alter cognate endogenous alleles. An altered AR gene should not fully
encode the same AR as native to the host animal, and its expression product can be
altered to a minor or great degree, or absent altogether. In cases where it is useful to
express a non-native rhAR gene in a transgenic animal in the absence of a native AR
gene we prefer that the altered AR gene induce a null lethal knockout phenotype in
the animal. However a more modestly modified AR gene can also be useful and is
within the scope of the present invention.

A type of target cell for transgene introduction is the embryonic stem
cell (ES). ES cells can be obtained from pre-implantation embryos cultured in vitro
and fused with embryos (Evans et al., 1981, Nature 292:154-156; Bradley et al.,
1984, Nature 309:255-258; Gossler et al., 1986, Proc. Natl. Acad. Sci. USA
83:9065-9069; and Robertson et al., 1986, Nature 322:445-448). Transgenes can be
efficiently introduced into the ES cells by a variety of standard techniques such as DNA
transfection, microinjection, or by retrovirus-mediated transduction. The resultant
transformed ES cells can thereafter be combined with blastocysts from a non-human
animal. The introduced ES cells thereafter colonize the embryo and contribute to the
germ line of the resulting chimeric animal (Jaenisch, 1988, Science 240: 1468-1474).

The methods for evaluating the targeted recombination events as well
as the resulting knockout mice are readily available and known in the art. Such
methods include, but are not limited to DNA (Southern) hybridization to detect the
targeted allele, polymerase chain reaction (PCR), polyacrylamide gel electrophoresis
(PAGE) and Western blots to detect DNA, RNA and protein.

The present invention also relates to polyclonal and monoclonal
antibodies raised in response to rhAR, or a biologically functional derivative thereof.
In particular, antibodies to the A/B domain and the hinge domain (D domain) are
preferred. To this end, the DNA molecules, RNA molecules, recombinant protein and
antibodies of the present invention may be used to screen and measure levels of
rhAR. The recombinant proteins, DNA molecules, RNA molecules and antibodies
lend themselves to the formulation of kits suitable for the detection and typing of
rhAR.

The present invention also relates to assays utilized to identify
compounds that modulate rhAR activity. One aspect of this portion of the invention
is shown in Example Section 2, an in vitro binding assay using a GST-rhARLBD
fusion protein. Other assays are contemplated, including but not limited to using
rhAR cDNA clones and/or expressed proteins in co-transfection assays to measure
bioactivity of compounds, as well as mammalian two-hybrid assays to test the effect
of compounds on NH2- and COOH-terminus interaction of Macaca mulatta AR.
Such assays are described infra.
It is an object of the present invention to provide an isolated nucleic
acid molecule which encodes a novel form of a nuclear receptor protein such as
rhAR, nuclear receptor protein fragments of full-length proteins such
as rhAR, and mutants which are derivatives of SEQ ID NOs:2 and 4. Any such
polynucleotide includes but is not necessarily limited to nucleotide substitutions,
deletions, additions, amino-terminal truncations and carboxy-terminal truncations
such that these mutations encode mRNA which express a protein or protein fragment
of diagnostic, therapeutic or prophylactic use and would be useful for screening for
agonists and/or antagonists for rhAR function.

Another object of this invention is tissue typing using probes or
antibodies of this invention. In a particular embodiment, polynucleotide probes are
used to identify tissues expressing rhAR mRNA. In another embodiment, probes or
antibodies can be used to identify a type of tissue based on rhAR expression or
display of rhAR receptors.

It is a further object of the present invention to provide rhAR proteins
or protein fragments encoded by the nucleic acid molecules referred to in the
preceding paragraphs, including such rhAR proteins which are expressed within host
cells transfected with a DNA expression vector which contains an rhAR nucleotide
sequence as disclosed herein.

It is a further object of the present invention to provide recombinant
vectors and recombinant host cells which comprise a nucleic acid sequence encoding
rhAR or a biological equivalent thereof.

It is an object of the present invention to provide a substantially
purified form of rhAR, as set forth in SEQ ID NOs:2 and 4.

It is an object of the present invention to provide for biologically
functional derivatives of rhAR, including but not necessarily limited to amino acid
substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal
truncations such that these fragments and/or mutants provide for proteins or protein
fragments of diagnostic, therapeutic or prophylactic use.

It is also an object of the present invention to provide for rhAR-based
in-frame fusion constructions, methods of expressing these fusion constructions and
biological equivalents disclosed herein, related assays, recombinant cells expressing
these constructs, the expressed fusion proteins, and agonistic and/or antagonistic
compounds identified through the use of DNA molecules encoding these rhAR-based
fusion proteins. A preferred fusion construct is one which encodes all or a portion of
the LBD and/or DBD regions of the rhAR open reading frame. A preferred fusion
protein is one which is expressed from such a construct.

It is also an object of the present invention to provide for assays to 
identify compounds which modulate rhAR activity. 

As used herein, "AR" refers to androgen receptor.

As used herein, "rhAR" refers to Macaca mulatta androgen receptor.

As used herein, "DBD" refers to DNA binding domain.

As used herein, "LBD" refers to ligand binding domain.

As used herein, "SARM" refers to selective androgen receptor modulator.

As used herein, the term "mammalian host" refers to any mammal, 
including a human being. 

As used herein, "R1881" refers to methyltrienolone, also known as
17β-hydroxy-17-methylestra-4,9,11-trien-3-one, the preparation of which is described
in Velluz et al., 1963, Compt. Rend. 257: 569 et seq.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A-C shows the nucleotide sequence (SEQ ID NO:1) which
comprises the open reading frame encoding the rhAR. Underlined nucleotide 1051
('A') is the site of an allelic variant, which may also be represented by a 'G' residue (as
disclosed in SEQ ID NO:3).

Figure 2 shows the amino acid sequence (SEQ ID NO:2) of rhAR.
The region in bold and underlined (from residue 535 to residue 600 of SEQ ID NO:2)
is the DNA binding domain (DBD). Residue 210 (Glu residue also in bold and
underlined) is the site of an allelic variant which may also be represented by a Gly
residue (as encoded by SEQ ID NO:3 and disclosed herein as SEQ ID NO:4).

Figure 3A-F shows the coding (SEQ ID NO:1) and anticoding (SEQ
ID NO:5) strands which comprise the open reading frame for the rhesus androgen
receptor protein (SEQ ID NO:2). The underlined portion (i.e., from amino acid
residue 535 to amino acid residue 600 of SEQ ID NO:2) represents the DBD region
of expressed rhAR protein.
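
By convention, an anticoding strand written 5' to 3' is the reverse complement of the coding strand. The following is a minimal sketch of that relationship, offered as an illustration only. The helper name is hypothetical, and the input is a ten-nucleotide excerpt of SEQ ID NO:1 (nucleotides 421-430), not the full sequence. Whether SEQ ID NO:5 is presented in exactly this orientation is an assumption here.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "GGATGGAGGT"  # nucleotides 421-430 of SEQ ID NO:1
    print(reverse_complement(coding))  # ACCTCCATCC
```

Applying the function twice returns the original excerpt, which is a quick consistency check when comparing the two strands.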

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to the identification and cloning of genes
encoding full-length Macaca mulatta androgen receptor (rhAR) and their use in the
identification of tissue selective androgen compounds, including those active in bone
formation, myoanabolism, treatment of sarcopenia, relief of post-menopausal
symptoms, treatment of benign prostatic hyperplasia, treatment of acne, treatment of
hirsutism, treatment of male hypogonadism, prevention and treatment of prostate
cancer, management of lipids, treatment of atherosclerosis, and prevention and
treatment of breast cancer. The androgen receptor is a member of the nuclear receptor
superfamily. The superfamily is composed of a group of structurally related receptors
that are regulated by chemically distinct ligands. Their common structure is a
conserved DNA binding domain (DBD) located in the center of the peptide and a
conserved ligand-binding domain (LBD) at the C-terminus. Eight out of the nine
non-variant cysteines form two type II zinc fingers which distinguish them from other
DNA-binding proteins.

The present invention relates to isolated nucleic acid molecules
(polynucleotides) which encode novel Macaca mulatta (rhesus monkey) androgen
receptor (rhAR). The isolated polynucleotides of the present invention encode a
non-human primate member of this nuclear receptor superfamily. The DNA molecules
disclosed herein may be transfected into a host cell of choice wherein the
recombinant host cell provides a source for substantial levels of an expressed,
substantially purified, functional recombinant rhAR, which also forms a portion of
the present invention. As noted herein, such a functional nuclear receptor will
provide for an effective target for use in screening methodology to identify
modulators of the androgen receptor, modulators which may be effective in regulating
development, reproduction and maintenance of bone and muscle, treatment of
prostate disease, regulation of lipid metabolism and hippocampal function. It is also
known that abnormal function of AR can cause prostate cancer. Accumulated
information has also indicated that androgen deficiency results in various
abnormalities of bone metabolism, such as increased bone loss. Androgen therapy
has been used widely to treat a variety of disorders in both men and women.
However, the development of an androgen modulator with a desirable effect (e.g.,
bone promotion) and fewer side effects (e.g., aggressive behavior, acne) has not been
achieved. Recent progress in hormone replacement therapy has demonstrated the
possibility of developing selective androgen receptor modulators (SARMs) (J. of
Clinical Endocrinology & Metabolism, 84(10): 3459 (1999)). Therefore, a compound
screening system using AR, such as the rhAR disclosed herein, is needed for safe
androgen drug development.

A preferred embodiment of the present invention is disclosed in Figure
1A-C and SEQ ID NO:1, an isolated DNA molecule encoding rhAR. Nucleotide
1051 is polymorphic, present as either an 'A' nucleotide or a 'G' nucleotide (see SEQ
ID NO:3). This embodiment is shown as follows, with 1051-A being bolded and
underlined:



   1 CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
  51 AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
 101 CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
 151 CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
 201 GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
 251 TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
 301 TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
 351 TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
 401 GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
 451 TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
 501 TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
 551 GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
 601 AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
 651 GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
 701 GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
 751 CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
 801 GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
 851 CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
 901 GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
 951 CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
1001 AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
1051 AGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
1101 TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
1151 GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
1201 CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
1251 CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
1301 CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
1351 GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
1401 ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
1451 GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
1501 CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
1551 GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
1601 CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
1651 CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
1701 GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
1751 CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
1801 GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
1851 GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
1901 TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
1951 ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
2001 TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
2051 TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
2101 AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
2151 TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
2201 GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
2251 TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
2301 CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
2351 TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
2401 TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
2451 CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
2501 GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
2551 TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
2601 TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
2651 TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
2701 GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
2751 CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
2801 AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
2851 ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
2901 TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
2951 GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
3001 CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
3051 GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
3101 CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
3151 CCTTTCAGAT GTCTTCTGCC TGTTA (SEQ ID NO:1).

As noted above, nucleotide 1051 represents a single nucleotide polymorphism
(SNP). To this end, another preferred embodiment of the present invention is an
isolated DNA molecule as shown in Figure 1A-C and SEQ ID NO:1, except
nucleotide 1051 is a 'G' nucleotide instead of an 'A' nucleotide, this isolated DNA
molecule being additionally disclosed as SEQ ID NO:3, as follows, with 1051-G
being bolded and underlined:








   1 CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
  51 AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
 101 CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
 151 CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
 201 GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
 251 TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
 301 TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
 351 TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
 401 GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
 451 TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
 501 TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
 551 GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
 601 AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
 651 GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
 701 GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
 751 CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
 801 GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
851 CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
901 GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
951 CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
1001 AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
1051 GGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
1101 TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
1151 GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
1201 CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
1251 CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
1301 CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
1351 GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
1401 ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
1451 GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
1501 CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
1551 GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
1601 CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
1651 CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
1701 GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
1751 CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
1801 GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
1851 GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
1901 TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
1951 ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
2001 TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
2051 TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
2101 AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
2151 TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
2201 GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
2251 TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
2301 CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
2351 TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
2401 TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
2451 CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
2501 GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
2551 TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
2601 TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
2651 TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
2701 GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
2751 CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
2801 AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
2851 ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
2901 TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
2951 GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
3001 CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
3051 GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
3101 CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
3151 CCTTTCAGAT GTCTTCTGCC TGTTA (SEQ ID NO:3).

The above-exemplified isolated DNA molecules comprise the following
characteristics:

(SEQ ID NO:1) - 3175 nuc.: initiating Met (nuc. 423-425) and "TGA" term. codon
(nuc. 3106-3108), with a polymorphic site at nucleotide 1051 ('A'), the open
reading frame resulting in an expressed protein of 895 amino acids, as set forth
in SEQ ID NO:2, with amino acid residue 210 being a Glu (E) residue.
(SEQ ID NO:3) - 3175 nuc.: initiating Met (nuc. 423-425) and "TGA" term. codon
(nuc. 3106-3108), with a polymorphic site at nucleotide 1051 ('G'), the open
reading frame resulting in an expressed protein of 895 amino acids, as set forth
in SEQ ID NO:4, with amino acid residue 210 being a Gly (G) residue.
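
The relationship between the polymorphic nucleotide and residue 210 can be checked with a little arithmetic. The following is only an illustrative sketch: it takes the start of the open reading frame (nucleotide 423) and the SNP position (1051) from the text above, and it assumes that nucleotides 1050 and 1052 are both 'G', as read from the listing of SEQ ID NO:3. The function and variable names are hypothetical.

```python
# Only the two codons needed for this illustration (standard genetic code).
CODON_TO_AA = {"GAG": "Glu", "GGG": "Gly"}

ORF_START = 423      # first nucleotide of the initiating Met codon (from the text)
SNP_POSITION = 1051  # polymorphic nucleotide (from the text)


def codon_for_nucleotide(position, orf_start=ORF_START):
    """Locate the codon that contains a 1-based nucleotide position.

    Returns (residue_number, codon_start, offset_in_codon), where the
    residue number and codon start are 1-based and the offset is 0-based.
    """
    residue = (position - orf_start) // 3 + 1
    codon_start = orf_start + (residue - 1) * 3
    return residue, codon_start, position - codon_start


residue, codon_start, offset = codon_for_nucleotide(SNP_POSITION)

# Assumption: flanking bases at 1050 and 1052 are 'G' (read from the listing).
flank_left, flank_right = "G", "G"
codon_a = flank_left + "A" + flank_right  # SEQ ID NO:1 allele
codon_g = flank_left + "G" + flank_right  # SEQ ID NO:3 allele

print(f"residue {residue}: codon at nuc. {codon_start}-{codon_start + 2}")
print(f"1051-A: {codon_a} -> {CODON_TO_AA[codon_a]}")
print(f"1051-G: {codon_g} -> {CODON_TO_AA[codon_g]}")
```

Under these assumptions the SNP falls in the middle position of codon 210, which is consistent with the Glu/Gly difference stated for SEQ ID NOs:2 and 4.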

The present invention also relates to isolated nucleic acid fragments
which encode mRNA expressing a biologically active rhesus monkey androgen
receptor which belongs to the nuclear receptor superfamily. A preferred embodiment
relates to isolated nucleic acid fragments of SEQ ID NOs:1 and 3 which encode
mRNA expressing a biologically functional derivative of rhAR. Any such nucleic
acid fragment will encode either a protein or protein fragment comprising at least an
intracellular DNA-binding domain and/or ligand binding domain, domains conserved
throughout the nuclear receptor family which exist in rhAR (SEQ ID
NOs: 2 and 4). Any such polynucleotide includes but is not necessarily limited to
nucleotide substitutions (including but not limited to SNPs, such as single nucleotide
substitutions as disclosed herein, as well as deletions and/or insertions which fall
within the known working definition of a SNP), deletions, additions, amino-terminal
truncations and carboxy-terminal truncations such that these mutations encode
mRNA which expresses a protein or protein fragment of diagnostic, therapeutic or
prophylactic use and would be useful for screening for agonists and/or antagonists of
rhAR.

The isolated nucleic acid molecule of the present invention may
include a deoxyribonucleic acid molecule (DNA), such as genomic DNA and
complementary DNA (cDNA), which may be single (coding or noncoding strand) or
double stranded, as well as synthetic DNA, such as a synthesized, single stranded
polynucleotide. The isolated nucleic acid molecule of the present invention may also
include a ribonucleic acid molecule (RNA). The preferred template is DNA.

It is known that there is a substantial amount of redundancy in the
various codons which code for specific amino acids. Therefore, this invention is also
directed to those DNA sequences which encode RNA comprising alternative codons that
code for the eventual translation of the identical amino acid, as shown below:

A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU.

Therefore, the present invention discloses codon redundancy that may result in
differing DNA molecules expressing an identical protein. For purposes of this
specification, a sequence bearing one or more replaced codons will be defined as a
degenerate variation. Also included within the scope of this invention are mutations
either in the DNA sequence or the translated protein, which do not substantially alter
the ultimate physical properties of the expressed protein. For example, substitution of
valine for leucine, arginine for lysine, or asparagine for glutamine may not cause a
change in functionality of the polypeptide.
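
A degenerate variation, as defined above, can be illustrated with a short sketch that translates two different DNA sequences and confirms that they encode the same peptide. This is a minimal illustration, not part of the disclosed methods: the first sequence is nucleotides 423-443 as read from the listing of SEQ ID NO:3, while the second sequence is a hypothetical variant constructed for this example by replacing several codons with synonymous ones. The table is the standard genetic code written with DNA bases (T in place of U).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence (length a multiple of 3) to one-letter codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


native = "ATGGAGGTGCAGTTAGGGCTG"   # nuc. 423-443 as read from the listing
variant = "ATGGAAGTCCAATTGGGACTC"  # hypothetical degenerate variation

print(translate(native), translate(variant))
print("Leu codons:", synonymous_codons("L"))
```

Both sequences translate to the same seven residues even though they differ at several nucleotide positions, which is the sense in which the term degenerate variation is used in this specification.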

It is known that DNA sequences coding for a peptide may be altered
so as to code for a peptide having properties that are different than those of the
naturally occurring peptide. Methods of altering the DNA sequences include but are
not limited to site directed mutagenesis. Examples of altered properties include but
are not limited to changes in the affinity of an enzyme for a substrate or a receptor for
a ligand.

As used herein, "purified" and "isolated" may be utilized
interchangeably to stand for the proposition that the nucleic acid, protein, or
respective fragment thereof in question has been substantially removed from its in
vivo environment so that it may be manipulated by the skilled artisan, such as but not
limited to nucleotide sequencing, restriction digestion, site-directed mutagenesis, and
subcloning into expression vectors for a nucleic acid fragment as well as obtaining
the protein or protein fragment in pure quantities so as to afford the opportunity to
generate polyclonal antibodies, monoclonal antibodies, amino acid sequencing, and
peptide digestion. Therefore, the nucleic acids claimed herein may be present in
whole cells or in cell lysates or in a partially purified or substantially purified form.
A nucleic acid is considered substantially purified when it is purified away from
environmental contaminants. Thus, a nucleic acid sequence isolated from cells is
considered to be substantially purified when purified from cellular components by
standard methods while a chemically synthesized nucleic acid sequence is considered
to be substantially purified when purified from its chemical precursors.

Any of a variety of procedures may be used to clone rhAR. These
methods include, but are not limited to, (1) a RACE PCR cloning technique
(Frohman, et al., 1988, Proc. Natl. Acad. Sci. USA 85: 8998-9002). 5' and/or 3'
RACE may be performed to generate a full length cDNA sequence. This strategy
involves using gene-specific oligonucleotide primers for PCR amplification of rhAR
cDNA. These gene-specific primers are designed through identification of an
expressed sequence tag (EST) nucleotide sequence which has been identified by
searching any number of publicly available nucleic acid and protein databases; (2)
direct functional expression of the rhAR following the construction of a
rhAR-containing cDNA library in an appropriate expression vector system; (3) screening a
rhAR-containing cDNA library constructed in a bacteriophage or plasmid shuttle
vector with a labeled degenerate oligonucleotide probe designed from the amino acid
sequence of the rhAR protein; (4) screening a rhAR-containing cDNA library
constructed in a bacteriophage or plasmid shuttle vector with a partial cDNA
encoding the rhAR protein. This partial cDNA is obtained by the specific PCR
amplification of rhAR DNA fragments through the design of degenerate
oligonucleotide primers from the amino acid sequence known for other nuclear
receptors which are related to the rhAR protein; (5) screening a rhAR-containing
cDNA library constructed in a bacteriophage or plasmid shuttle vector with a partial
cDNA encoding the rhAR protein. This strategy may also involve using
gene-specific oligonucleotide primers for PCR amplification of rhAR cDNA identified as
an EST as described above; or (6) designing 5' and 3' gene specific oligonucleotides
using SEQ ID NO:1 or 3 as a template so that either the full-length cDNA may be
generated by known PCR techniques, or a portion of the coding region may be
generated by these same known PCR techniques to generate and isolate a portion of
the coding region to use as a probe to screen one of numerous types of cDNA and/or
genomic libraries in order to isolate a full-length version of the nucleotide molecule
encoding rhAR.

It is readily apparent to those ordinarily skilled in the art that other
types of libraries, as well as libraries constructed from other cell types or species
types, may be useful for isolating a rhAR-encoding DNA or a rhAR homologue.
Other types of libraries include, but are not limited to, cDNA libraries derived from
cells or cell lines other than rhAR cells or tissue, such as murine cells, rodent
cells or any other such vertebrate host which may contain rhAR-encoding DNA.
Additionally, a rhAR gene and homologues may be isolated by oligonucleotide- or
polynucleotide-based hybridization screening of a vertebrate genomic library,
including but not limited to, a murine genomic library, a rodent genomic library, as
well as concomitant rhAR genomic DNA libraries.
It is readily apparent to those skilled in the art that suitable cDNA
libraries may be prepared from cells or cell lines which have rhAR activity. The
selection of cells or cell lines for use in preparing a cDNA library to isolate a cDNA
encoding rhAR may be done by first measuring cell-associated rhAR activity using
any known assay available for such a purpose.

Preparation of cDNA libraries can be performed by standard
techniques well known in the art. Well known cDNA library construction techniques
can be found, for example, in Sambrook et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Complementary DNA libraries may also be obtained from numerous commercial
sources, including but not limited to Clontech Laboratories, Inc. and Stratagene.

It is also readily apparent to those skilled in the art that DNA encoding
rhAR may also be isolated from a suitable genomic DNA library. Construction of
genomic DNA libraries can be performed by standard techniques well known in the
art. Well known genomic DNA library construction techniques can be found in
Sambrook, et al., supra.

In order to clone the rhAR gene by one of the preferred methods, the
amino acid sequence or DNA sequence of rhAR or a homologous protein may be
necessary. To accomplish this, the rhAR protein or a homologous protein may be
purified and partial amino acid sequence determined by automated sequenators or
mass spectroscopy. It is not necessary to determine the entire amino acid sequence,
but the linear sequence of two regions of 6 to 8 amino acids can be determined for the
PCR amplification of a partial rhAR DNA fragment. Once suitable amino acid
sequences have been identified, the DNA molecules capable of encoding them are
synthesized. Because the genetic code is degenerate, more than one codon may be
used to encode a particular amino acid, and therefore, the amino acid sequence can be
encoded by any of a set of similar DNA oligonucleotides. Only one member of the
set will be identical to the rhAR sequence but others in the set will be capable of
hybridizing to rhAR DNA even in the presence of DNA oligonucleotides with
mismatches. The mismatched DNA oligonucleotides may still sufficiently hybridize
to the rhAR DNA to permit identification and isolation of rhAR encoding DNA.
Alternatively, the nucleotide sequence of a region of an expressed sequence may be
identified by searching one or more available genomic databases. Gene-specific
primers may be used to perform PCR amplification of a cDNA of interest from either
a cDNA library or a population of cDNAs. As noted above, the appropriate
nucleotide sequence for use in a PCR-based method may be obtained from SEQ ID
NO: 1 or 18-20, either for the purpose of isolating overlapping 5' and 3' RACE
products for generation of a full-length sequence coding for rhAR, or to isolate a
portion of the nucleotide molecule coding for rhAR for use as a probe to screen one
or more cDNA- or genomic-based libraries to isolate a full-length molecule encoding
rhAR or rhAR-like proteins.
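
The size of the oligonucleotide set that encodes a short peptide can be estimated directly from the codon table. The sketch below is an illustration only and is not the procedure used in the Examples: it assumes the standard genetic code, takes residues 501-520 as read from the listing of SEQ ID NO:2 as a sample region, and uses hypothetical helper names. It picks the six-residue window with the fewest encoding oligonucleotides and enumerates that set.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid code to the DNA codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def least_degenerate_window(protein, length):
    """Return the first window of the given length with the lowest degeneracy."""
    windows = (protein[i:i + length] for i in range(len(protein) - length + 1))
    return min(windows, key=degeneracy)


def degenerate_oligos(peptide):
    """Yield every DNA oligonucleotide that encodes the peptide."""
    for combo in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(combo)


region = "PWMDSYSGPYGDMRLETARD"  # residues 501-520 as read from SEQ ID NO:2
best = least_degenerate_window(region, 6)
oligos = list(degenerate_oligos(best))
print(best, degeneracy(best), len(oligos))
```

For this sample region the least degenerate hexapeptide is encoded by 96 oligonucleotides, exactly one of which (CCCTGGATGGATAGCTAC, as read from the cDNA listing) matches the rhAR sequence. Windows rich in Met and Trp keep the set small, whereas Leu, Ser and Arg each multiply it by six.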

In an exemplified method, the rhAR full-length cDNA of the present
invention was isolated by screening template cDNA synthesized from Macaca
mulatta prostate mRNA. Oligonucleotide primers based on Macaca fascicularis AR
were synthesized. Template cDNA was synthesized from Macaca mulatta prostate
mRNA. NH2-portion and COOH-portion primer pairs were used to generate two
PCR fragments, which were subcloned, characterized and assembled into a full length
DNA sequence (see SEQ ID NOs: 1 and 3). The cloned Macaca mulatta AR cDNA
has 7 nucleotide differences from Macaca fascicularis AR in the coding region which
result in a difference of two amino acid residues (Fig. 4). The two macaque polyQ and
polyG sequences are identical to each other, and are in turn shorter than the
corresponding human sequences. A single amino acid difference between the
macaque and human AR, [Ala-632], is present in the DBD-Hinge-LBD region.

The present invention also relates to recombinant vectors and
recombinant hosts, both prokaryotic and eukaryotic, which have been transfected
and/or transformed with the nucleic acid molecules disclosed throughout this
specification.

The present invention also relates to methods of expressing rhAR and
biological equivalents disclosed herein, the expressed, processed form of the protein,
assays employing these recombinantly expressed gene products, cells expressing
these gene products, and agonistic and/or antagonistic compounds identified through
the use of assays utilizing these recombinant forms, including, but not limited to, one
or more modulators of rhAR, either through direct contact with the LBD or through
direct or indirect contact with a ligand which either interacts with the DBD or with
the wild-type transcription complex with which the androgen receptor interacts in trans,
thereby modulating bone biology, for example.

The present invention relates to methods of expressing rhAR in
recombinant systems and of identifying agonists and antagonists of rhAR. The novel
rhAR proteins of the present invention are suitable for use in an assay procedure for
the identification of compounds which modulate the transactivation activity of
mammalian rhAR. Modulating rhAR activity, as described herein, includes the
inhibition or activation of this soluble transacting factor and therefore includes
directly or indirectly affecting the normal regulation of the rhAR activity.
Compounds that modulate rhAR include agonists, antagonists and compounds which
directly or indirectly affect regulation of rhAR. When screening compounds in order
to identify potential pharmaceuticals that specifically interact with a target protein, it
is necessary to ensure that the compounds identified are as specific as possible for the
target protein. To do this, it may be necessary to screen the compounds against as wide
an array as possible of proteins that are similar to the target receptor, including
species homologues of the rhesus androgen receptor. Thus, in order to find compounds
that are potential pharmaceuticals that interact with rhAR, it is necessary not only to
ensure that the compounds interact with rhAR (the "plus target") and produce the
desired pharmacological effect through rhAR, it is also necessary to determine that
the compounds do not interact with proteins B, C, D, etc. (the "minus targets"). In
general, as part of a screening program, it is important to have as many minus targets
as possible (see Hodgson, 1992, Bio/Technology 10:973-980, at 980). rhAR proteins
and the DNA molecules encoding this protein may serve this purpose in assays
utilizing, for example, other members of the nuclear receptor superfamily.

As used herein, a "biologically functional derivative" of a wild-type
rhAR possesses a biological activity that is related to the biological activity of the
wild type rhAR. The term "functional derivative" is intended to include the
"fragments," "mutants," "variants," "degenerate variants," "analogs" and
"homologues" of the wild type rhAR protein. The term "fragment" is meant to refer
to any polypeptide subset of wild-type rhAR, including but not necessarily limited to
rhAR proteins comprising amino acid substitutions, deletions, additions, amino
terminal truncations and/or carboxy-terminal truncations. The term "mutant" is
meant to refer to a subset of a biologically active fragment that may be substantially
similar to the wild-type form but possesses distinguishing biological characteristics.
Such altered characteristics include but are in no way limited to altered substrate
binding, altered substrate affinity and altered sensitivity to chemical compounds
affecting biological activity of the rhAR or a rhAR functional derivative. The term
"variant" is meant to refer to a molecule substantially similar in structure and function
to either the wild-type protein or to a fragment thereof.

A variety of mammalian expression vectors may be used to express
recombinant rhAR in mammalian cells. Expression vectors are defined herein as
DNA sequences that are required for the transcription of cloned DNA and the
translation of their mRNAs in an appropriate host. Such vectors can be used to
express eukaryotic DNA in a variety of hosts such as bacteria, blue green algae, plant
cells, insect cells and animal cells. Specifically designed vectors allow the shuttling
of DNA between hosts such as bacteria-yeast or bacteria-animal cells. An
appropriately constructed expression vector should contain: an origin of replication
for autonomous replication in host cells, selectable markers, a limited number of
useful restriction enzyme sites, a potential for high copy number, and active
promoters. A promoter is defined as a DNA sequence that directs RNA polymerase
to bind to DNA and initiate RNA synthesis. A strong promoter is one that causes
mRNAs to be initiated at high frequency. Expression vectors may include, but are
not limited to, cloning vectors, modified cloning vectors, specifically designed
plasmids or viruses.

Commercially available mammalian expression vectors which may be
suitable for recombinant rhAR expression include, but are not limited to, pcDNA3.1
(Invitrogen), pLITMUS28, pLITMUS29, pLITMUS38 and pLITMUS39 (New
England BioLabs), pcDNAI, pcDNAIamp (Invitrogen), pcDNA3 (Invitrogen),
pMC1neo (Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo
(ATCC 37593), pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC
37224), pRSVgpt (ATCC 37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC
37146), pUCTag (ATCC 37460), and 1ZD35 (ATCC 37565).

A variety of bacterial expression vectors may be used to express
recombinant rhAR in bacterial cells. Commercially available bacterial expression
vectors which may be suitable for recombinant rhAR expression include, but are not
limited to, pCRII (Invitrogen), pCR2.1 (Invitrogen), pQE (Qiagen), pET11a
(Novagen), lambda gt11 (Invitrogen), pKK223-3 (Pharmacia), and pGEX2T
(Pharmacia).

A variety of fungal cell expression vectors may be used to express
recombinant rhAR in fungal cells. Commercially available fungal cell expression
vectors which may be suitable for recombinant rhAR expression include but are not
limited to the ESP yeast expression system, which utilizes S. pombe as the
expression host, pYES2 (Invitrogen) and Pichia expression vector (Invitrogen).

A variety of insect cell expression vectors may be used to express
recombinant receptor in insect cells. Commercially available insect cell expression
vectors which may be suitable for recombinant expression of rhAR include but are
not limited to pBlueBacIII and pBlueBacHis2 (Invitrogen), and pAcG2T
(Pharmingen).

An expression vector containing DNA encoding a rhAR or rhAR-like
protein may be used for expression of rhAR in a recombinant host cell. Recombinant
host cells may be prokaryotic or eukaryotic, including but not limited to bacteria such
as E. coli, fungal cells such as yeast, mammalian cells including but not limited to cell
lines of rhAR, bovine, porcine, monkey and rodent origin, and insect cells including
but not limited to Drosophila- and silkworm-derived cell lines. Cell lines derived
from mammalian species which may be suitable and which are commercially
available include, but are not limited to, L cells L-M(TK-) (ATCC CCL 1.3), L cells
L-M (ATCC CCL 1.2), Saos-2 (ATCC HTB-85), 293 (ATCC CRL 1573), Raji
(ATCC CCL 86), CV-1 (ATCC CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC
CRL 1651), CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL
1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL 26),
MRC-5 (ATCC CCL 171) and CPAE (ATCC CCL 209).

The expression vector may be introduced into host cells via any one of
a number of techniques including but not limited to transfection, transformation,
protoplast fusion, and electroporation. The expression vector-containing cells are
individually analyzed to determine whether they produce rhAR protein.
Identification of rhAR expressing cells may be done by several means, including but
not limited to immunological reactivity with anti-rhAR antibodies, labeled ligand
binding and the presence of host cell-associated rhAR activity.

The cloned rhAR cDNA obtained through the methods described
above may be recombinantly expressed by molecular cloning into an expression
vector (such as pcDNA3.1, pQE, pBlueBacHis2 and pLITMUS28) containing a
suitable promoter and other appropriate transcription regulatory elements, and
transferred into prokaryotic or eukaryotic host cells to produce recombinant rhAR.
Techniques for such manipulations can be found described in Sambrook, et al., supra,
are discussed at length in the Example section and are well known and easily
available to the artisan of ordinary skill in the art.

Expression of rhAR DNA may also be performed using in vitro
produced synthetic mRNA. Synthetic mRNA can be efficiently translated in various
cell-free systems, including but not limited to wheat germ extracts and reticulocyte
extracts, as well as efficiently translated in cell based systems, including but not
limited to microinjection into frog oocytes, with microinjection into frog oocytes
being preferred.

To determine the rhAR cDNA sequence(s) that yields optimal levels of
rhAR, cDNA molecules including but not limited to the following can be constructed:
a cDNA fragment containing the full-length open reading frame for rhAR as well as
various constructs containing portions of the cDNA encoding only specific domains
of the protein or rearranged domains of the protein. All constructs can be designed to
contain none, all or portions of the 5' and/or 3' untranslated region of a rhAR cDNA.
The expression levels and activity of rhAR can be determined following the
introduction, both singly and in combination, of these constructs into appropriate host
cells. Following determination of the rhAR cDNA cassette yielding optimal
expression in transient assays, this rhAR cDNA construct is transferred to a variety of
expression vectors (including recombinant viruses), including but not limited to those
for mammalian cells, plant cells, insect cells, oocytes, bacteria, and yeast cells.

A preferred aspect of the present invention relates to a substantially
purified form of the novel nuclear trans-acting receptor protein, a rhesus androgen
receptor protein, which is disclosed in Figure 2 (SEQ ID NO:2), as well as a
polymorph of the protein disclosed in SEQ ID NO:2, disclosed herein as SEQ ID
NO:4.

The rhAR protein disclosed in SEQ ID NO:2 is as follows:
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA
PTSSKDNYLE GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ (SEQ ID
NO:2).

As noted herein, the Glu-210 residue (underlined and bolded) of rhAR of SEQ
ID NO:2 represents an allelic variant at nucleotide 1051 of SEQ ID NO:1. A single
nucleotide change at nucleotide 1051 from 'A' to 'G' results in an amino acid change
at residue 210 of the rhAR, from the Glu residue of SEQ ID NO:2 to a Gly residue
(underlined and bolded), shown below as SEQ ID NO:4:
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA
PTSSKDNYLG GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ (SEQ ID
NO:4).

The underlined portions of SEQ ID NOs:2 and 4, from amino acid residue 535 to
residue 600, represent the DNA binding domain (DBD) of the rhAR receptor protein.
The DBD participates in regulating protein-protein interactions in the AR transrepression
pathway. Aarnisalo et al., Endocrinology 140(7):3097 (1999). Transcription
activation and repression functions of the androgen receptor are differentially
influenced by mutations in the DNA-binding domain. In transactivation, AR forms a
homodimer and binds a DNA response element via the DBD.
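
Because underlining does not survive in a plain-text listing, the DBD can instead be located by residue number. The following sketch is illustrative only: it transcribes residues 501-600 (the eleventh and twelfth 50-residue rows) as read from the listing of SEQ ID NO:2, and the helper name is hypothetical. Counting cysteines is included simply as a sanity check, given that the DBD of nuclear receptors is described in the Background as a cysteine-rich region.

```python
# Residues 501-600, transcribed from the eleventh and twelfth rows of the listing.
ROWS_501_TO_600 = (
    "PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL "
    "TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM"
)
FIRST_RESIDUE = 501  # residue number of the first letter in the rows above


def residues(listing, first_residue, start, end):
    """Return residues start..end (1-based, inclusive) from a spaced listing."""
    seq = listing.replace(" ", "")
    return seq[start - first_residue:end - first_residue + 1]


dbd = residues(ROWS_501_TO_600, FIRST_RESIDUE, 535, 600)
cys_positions = [535 + i for i, aa in enumerate(dbd) if aa == "C"]

print(dbd)
print(len(dbd), "residues;", len(cys_positions), "cysteines at", cys_positions)
```

The same slicing applies unchanged to SEQ ID NO:4, since the two proteins differ only at residue 210, which lies outside this region.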

The present invention also relates to a substantially purified, fully
processed (including proteolytic processing, such as processing of a natural, hybrid or
synthetic signal sequence, glycosylation and/or phosphorylation) mature rhAR
protein obtained from a recombinant host cell containing a DNA expression vector
comprising a nucleotide sequence as set forth in SEQ ID NOs:1 and 3, or nucleic
acid fragments thereof as described above, such DNA expression vectors expressing
the respective rhAR protein or rhAR precursor protein. It is especially preferred that
the recombinant host cell be a eukaryotic host cell, including but not limited to a
mammalian cell line or an insect cell line. In another embodiment, it is especially
preferred that the recombinant host cell be a yeast host cell.

The present invention also relates to isolated nucleic acid molecules
which are fusion constructions expressing fusion proteins useful in assays to identify
compounds which modulate mammalian AR. A preferred aspect of this portion of the
invention includes, but is not limited to, glutathione S-transferase (GST)-rhAR fusion
constructs. These fusion constructs include, but are not limited to, all or a portion of
the ligand-binding domain of rhAR, respectively, as an in-frame fusion at the carboxy
terminus of the GST gene. The disclosure of SEQ ID NOs:1 and 3 provides the
artisan of ordinary skill the information necessary to construct any such nucleic acid
molecule encoding a GST-nuclear receptor fusion protein. Soluble recombinant
GST-nuclear receptor fusion proteins may be expressed in various expression
systems, including but in no manner limited to a yeast expression system (see
Example Section 2), or Spodoptera frugiperda (Sf21) insect cells (Invitrogen)
using a baculovirus expression vector (e.g., Bac-N-Blue DNA from Invitrogen or
pAcG2T from Pharmingen). Example Section 2 discloses construction of
GST-Flag-rhARLBD (Mr = 60 kDa), which is expressed in yeast. This fusion protein is
purified by standard techniques and used in a hydroxyapatite binding assay in the
presence of labeled R1881 and unlabeled test compounds. After a parallel binding
reaction where increasing concentrations of unlabeled test compounds are incubated
with 3H-R1881, a hydroxyapatite slurry is prepared and processed. Unbound ligand is
removed, the subsequent hydroxyapatite pellet is washed, and ligand-bound GST-rhAR
is assessed to quantify the amount of radioligand (3H-R1881) bound to the recombinant
rhAR fusion protein. Results are compared to known high affinity ligands such as
5-alpha dihydrotestosterone and unlabeled R1881, which exhibit IC50s of ca. 1 nM.
See Asselin and Melancon, 1977, Steroids 30: 591-604; Ghanadian et al., 1977,
Urol. Res. 5(4): 169-173.

Other assays are contemplated for the rhAR cDNA clones of the
present invention, including but not limited to the use of these clone(s) to set up
co-transfection assays to measure bioactivity of compounds, or to set up mammalian
two-hybrid assays to test the effect of compounds on N- and C-terminus interaction of
Macaca mulatta AR.

For example, the present invention relates to constructs comprising a
receptor construct (e.g., containing the rhAR LBD, such as Gal4-rhAR-LBD) and a
reporter construct (such as SEAP or LacZ) with regulatory sites that respond to
increases and decreases in expression of the receptor construct. Therefore, the
present invention includes assays by which modulators of rhAR are identified.
Methods for identifying agonists and antagonists of other receptors are well known in
the art and can be adapted to identify compounds which affect in vivo levels of rhAR.
Accordingly, the present invention includes a method for determining whether a
substance is a potential modulator of AR levels that comprises:

(a) transfecting or transforming cells with an expression vector
encoding rhAR (such as the LBD of rhAR), also known as the receptor vector;

(b) transfecting or transforming the cells of step (a) with a second
expression vector, also known as a reporter vector, which comprises an element
known to respond to rhAR through protein-protein interactions but bind a non-rhAR
protein or a promoter fragment fused upstream of a reporter gene;

(c) allowing the transfected cells to grow for a time sufficient for
rhAR to be expressed;

(d) exposing some of the transfected cells expressing rhAR, the
"test cells", to a test substance while not exposing control cells to the test substance;

(e) measuring the expression of the reporter gene in both the test
cells and control cells.

Of course, "controls" in such assays may take many forms, such as but
not limited to the recitation of step (d) above, or possibly the use of cells not
transfected with the nucleic acid molecule expressing rhAR (i.e., non-transfected
cells), or cells transfected with vector alone, minus the coding region for rhAR. Also,
conditions under which step (d) of the method is practiced are conditions that are
typically used in the art for the study of protein-ligand interactions: e.g., physiological
pH; salt conditions such as those represented by such commonly used buffers as PBS
or in tissue culture media; a temperature of about 4°C to about 55°C. This assay may
be conducted with crude cell lysate, or with more purified materials.
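The comparison in step (e) can be pictured as a ratio of reporter signal in test cells to that in control cells. The sketch below is illustrative only: the function names, the 2-fold cut-off, and the SEAP readings are hypothetical and are not taken from the specification.

```python
from statistics import mean


def reporter_fold_change(test_readings, control_readings):
    """Ratio of mean reporter signal in test cells to that in control cells."""
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return mean(test_readings) / control_mean


def classify(fold_change, threshold=2.0):
    """Label the direction of the effect; the 2-fold threshold is an arbitrary cut-off."""
    if fold_change >= threshold:
        return "increases reporter expression"
    if fold_change <= 1.0 / threshold:
        return "decreases reporter expression"
    return "no clear effect"


# Hypothetical SEAP readings (arbitrary units) from triplicate wells.
test_wells = [120.0, 110.0, 130.0]        # cells exposed to the test substance
control_wells = [400.0, 380.0, 420.0]     # cells not exposed to the test substance
fc = reporter_fold_change(test_wells, control_wells)
print(f"fold change = {fc:.2f}: {classify(fc)}")
```

In practice the cut-off and the number of replicates would be chosen for the particular reporter and cell line; a formal statistical test could replace the fixed threshold.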
Alternatively, the transrepression assay may be carried out as follows:

(a) providing test cells by transfecting cells with a receptor
expression vector that directs the expression of rhAR or a portion thereof (such as the
LBD of rhAR) in the cells;

(b) providing test cells by transfecting the cells of step (a) with a
second reporter expression vector that directs expression of a reporter gene under
control of a regulatory element which is responsive to rhAR via protein-protein
interactions or a portion of the rhAR construct;

(c) exposing the test cells to the substance;

(d) measuring expression of the reporter gene;

(e) comparing the amount of expression of the reporter gene in the
test cells with the amount of expression of the reporter gene in control cells that have
been transfected with a reporter vector of step (b) but not a receptor vector of step (a).

This assay may be conducted with transfected mammalian cell lines
using cell-permeable test compounds.

An alternative assay would be one wherein multiple receptor/reporter
constructs are transfected into cells such that the general nature of the trans-acting
factor can be measured. It is evident that any number of variations known to one of
skill in the art may be utilized in order to provide for an assay to measure the effect of
a substance on the ability of the nuclear receptor proteins of the present invention to
effect transcription of a promoter of interest via protein-protein interactions with
heterologous DNA binding proteins.

The present invention includes additional methods for determining
whether a substance is capable of binding to rhAR, i.e., whether the substance is a
potential agonist or an antagonist of rhAR, where the method comprises:

(a) providing test cells by transfecting cells with an expression
vector that directs the expression of rhAR in the cells;

(b) exposing the test cells and control cells to the substance;

(c) measuring the amount of binding of the substance to rhAR;

(d) comparing the amount of binding of the substance to rhAR in
the test cells with the amount of binding of the substance to control cells that have not
been transfected with rhAR or a portion thereof; wherein if the amount of binding of
the substance is greater in the test cells as compared to the control cells, the substance
is capable of binding to rhAR. Determining whether the substance is actually an
agonist or antagonist can then be accomplished by the use of functional assays such
as the transrepression assay as described above.
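The decision rule in step (d) amounts to asking whether binding in rhAR-transfected cells exceeds binding in untransfected control cells. A minimal sketch follows, assuming replicate counts from a labeled substance; the function names, the optional `min_excess` margin, and the data are hypothetical.

```python
from statistics import mean


def specific_binding(test_counts, control_counts):
    """Mean signal in rhAR-transfected cells minus mean signal in control cells."""
    return mean(test_counts) - mean(control_counts)


def binds_rhar(test_counts, control_counts, min_excess=0.0):
    """True if binding in test cells exceeds that in control cells by more than min_excess."""
    return specific_binding(test_counts, control_counts) > min_excess


# Hypothetical counts per minute for a labeled substance (illustrative values only).
transfected = [5200.0, 4800.0, 5000.0]
untransfected = [900.0, 1100.0, 1000.0]
print(specific_binding(transfected, untransfected), binds_rhar(transfected, untransfected))
```

A non-zero `min_excess` would let the experimenter demand a margin above background before calling a substance a binder.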

Test compounds that regulate rhAR function through gene expression
may be evaluated employing the method above.

The conditions under which step (b) of the method is practiced are
conditions that are typically used in the art for the study of protein-ligand
interactions: e.g., physiological pH; salt conditions such as those represented by such
commonly used buffers as PBS or in tissue culture media; a temperature of about 4°C
to about 55°C.

The assays described above can be carried out with cells that have
been transiently or stably transfected with rhAR. Transfection is meant to include
any method known in the art for introducing rhAR into the test cells. For example,
transfection includes calcium phosphate or calcium chloride mediated transfection,
lipofection, infection with a retroviral construct containing rhAR, and electroporation.
Where binding of the substance or agonist to rhAR is measured, such binding can be
measured by employing a labeled substance or agonist. The substance or agonist can
be labeled in any convenient manner known to the art, e.g., radioactively,
fluorescently, or enzymatically.

The rhAR of the present invention may be used to screen for rhAR
ligands by assessing transcriptional regulation proceeding via the ligand-bound
rhAR-transcription factor protein-protein interactions. Alternatively, the rhAR of the
present invention may be employed to screen for rhAR ligands using co-transfection
with classical nuclear receptor response elements that bind the rhAR DBD.

The present invention also relates to polyclonal and monoclonal
antibodies raised in response to rhAR. Recombinant rhAR protein can be separated
from other cellular proteins by use of an immunoaffinity column made with
monoclonal or polyclonal antibodies specific for full-length rhAR protein, or
polypeptide fragments of rhAR protein. Additionally, polyclonal or monoclonal
antibodies may be raised against a synthetic peptide (usually from about 9 to about 25
amino acids in length) from a portion of the protein as disclosed in SEQ ID NO:2
and/or SEQ ID NO:4. Monospecific antibodies to rhAR are purified from
mammalian antisera containing antibodies reactive against rhAR or are prepared as
monoclonal antibodies reactive with rhAR using the technique of Kohler and Milstein
(1975, Nature 256: 495-497). Monospecific antibody as used herein is defined as a
single antibody species or multiple antibody species with homogenous binding
characteristics for rhAR. Homogenous binding as used herein refers to the ability of
the antibody species to bind to a specific antigen or epitope, such as those associated
with rhAR, as described above. rhAR-specific antibodies are raised by immunizing
animals such as mice, rats, guinea pigs, rabbits, goats, horses and the like, with an
appropriate concentration of rhAR protein or a synthetic peptide generated from a
portion of rhAR with or without an immune adjuvant.

Preimmune serum is collected prior to the first immunization. Each
animal receives between about 0.1 mg and about 1000 mg of rhAR protein associated
with an acceptable immune adjuvant. Such acceptable adjuvants include, but are not
limited to, Freund's complete, Freund's incomplete, alum-precipitate, water in oil
emulsion containing Corynebacterium parvum and tRNA. The initial immunization
consists of rhAR protein or peptide fragment thereof in, preferably, Freund's
complete adjuvant at multiple sites, either subcutaneously (SC), intraperitoneally (IP)
or both. Each animal is bled at regular intervals, preferably weekly, to determine
antibody titer. The animals may or may not receive booster injections following the
initial immunization. Those animals receiving booster injections are generally given
an equal amount of rhAR in Freund's incomplete adjuvant by the same route.
Booster injections are given at about three week intervals until maximal titers are
obtained. At about 7 days after each booster immunization or about weekly after a
single immunization, the animals are bled, the serum collected, and aliquots are
stored at about -20°C.

Monoclonal antibodies (mAb) reactive with rhAR are prepared by
immunizing inbred mice, preferably Balb/c, with rhAR protein. The mice are
immunized by the IP or SC route with about 1 mg to about 100 mg, preferably about
10 mg, of rhAR protein in about 0.5 ml buffer or saline incorporated in an equal
volume of an acceptable adjuvant, as discussed above. Freund's complete adjuvant is
preferred. The mice receive an initial immunization on day 0 and are rested for about
3 to about 30 weeks. Immunized mice are given one or more booster immunizations
of about 1 to about 100 mg of rhAR in a buffer solution such as phosphate buffered
saline by the intravenous (IV) route. Lymphocytes from antibody positive mice,
preferably splenic lymphocytes, are obtained by removing spleens from immunized
mice by standard procedures known in the art. Hybridoma cells are produced by
mixing the splenic lymphocytes with an appropriate fusion partner, preferably
myeloma cells, under conditions that will allow the formation of stable hybridomas.
Fusion partners may include, but are not limited to: mouse myelomas P3/NS1/Ag 4-1,
MPC-11, S-194 and Sp 2/0, with Sp 2/0 being preferred. The antibody producing
cells and myeloma cells are fused in polyethylene glycol, about 1000 mol. wt., at
concentrations from about 30% to about 50%. Fused hybridoma cells are selected by
growth in hypoxanthine, thymidine and aminopterin supplemented Dulbecco's
Modified Eagles Medium (DMEM) by procedures known in the art. Supernatant
fluids are collected from growth positive wells on about days 14, 18, and 21 and are
screened for antibody production by an immunoassay such as solid phase
immunoradioassay (SPIRA) using rhAR as the antigen. The culture fluids are also
tested in the Ouchterlony precipitation assay to determine the isotype of the mAb.
Hybridoma cells from antibody positive wells are cloned by a technique such as the
soft agar technique of MacPherson, 1973, Soft Agar Techniques, in Tissue Culture
Methods and Applications, Kruse and Paterson, Eds., Academic Press.

Monoclonal antibodies are produced in vivo by injection of
pristane-primed Balb/c mice, approximately 0.5 ml per mouse, with about 2 x 10^6 to
about 6 x 10^6 hybridoma cells about 4 days after priming. Ascites fluid is collected at
approximately 8-12 days after cell transfer and the monoclonal antibodies are purified
by techniques known in the art.

In vitro production of anti-rhAR mAb is carried out by growing the
hybridoma in DMEM containing about 2% fetal calf serum to obtain sufficient
quantities of the specific mAb. The mAb are purified by techniques known in the art.

Antibody titers of ascites or hybridoma culture fluids are determined
by various serological or immunological assays which include, but are not limited to,
precipitation, passive agglutination, enzyme-linked immunosorbent antibody (ELISA)
technique and radioimmunoassay (RIA) techniques. Similar assays are used to detect
the presence of rhAR in body fluids or tissue and cell extracts.

It is readily apparent to those skilled in the art that the above-described
methods for producing monospecific antibodies may be utilized to produce antibodies
specific for rhAR peptide fragments, or full-length rhAR.

rhAR antibody affinity columns are made, for example, by adding the
antibodies to Affigel-10 (Biorad), a gel support which is pre-activated with
N-hydroxysuccinimide esters such that the antibodies form covalent linkages with the
agarose gel bead support. The antibodies are then coupled to the gel via amide bonds
with the spacer arm. The remaining activated esters are then quenched with 1 M
ethanolamine HCl (pH 8.0). The column is washed with water followed by 0.23 M
glycine HCl (pH 2.6) to remove any non-conjugated antibody or extraneous protein.
The column is then equilibrated in phosphate buffered saline (PBS) (pH 7.3) and the
cell culture supernatants or cell extracts containing full-length rhAR or rhAR protein
fragments are slowly passed through the column. The column is then washed with
phosphate buffered saline until the optical density (A280) falls to background, then
the protein is eluted with 0.23 M glycine-HCl (pH 2.6). The purified rhAR protein is
then dialyzed against phosphate buffered saline.

Levels of rhAR in host cells are quantified by a variety of techniques
including, but not limited to, immunoaffinity and/or ligand affinity techniques.
rhAR-specific affinity beads or rhAR-specific antibodies are used to isolate
35S-methionine labeled or unlabelled rhAR. Labeled rhAR protein is analyzed by
SDS-PAGE. Unlabelled rhAR protein is detected by Western blotting, ELISA or RIA
assays employing either rhAR protein specific antibodies and/or antiphosphotyrosine
antibodies.

Following expression of rhAR in a host cell, rhAR protein may be
recovered to provide rhAR protein in active form. Several rhAR protein purification
procedures are available and suitable for use. Recombinant rhAR protein may be
purified from cell lysates and extracts, or from conditioned culture medium, by
various combinations of, or individual application of, salt fractionation, ion exchange
chromatography, size exclusion chromatography, hydroxylapatite adsorption
chromatography and hydrophobic interaction chromatography.

The DNA molecules, RNA molecules, recombinant protein and
antibodies of the present invention may be used to screen and measure levels of
rhAR. The recombinant proteins, DNA molecules, RNA molecules and antibodies
lend themselves to the formulation of kits suitable for the detection and typing of
rhAR. Such a kit would comprise a compartmentalized carrier suitable to hold in
close confinement at least one container. The carrier would further comprise reagents
such as recombinant rhAR or anti-rhAR antibodies suitable for detecting rhAR. The
carrier may also contain a means for detection such as labeled antigen or enzyme
substrates or the like.

Pharmaceutically useful compositions comprising modulators of rhAR
may be formulated according to known methods such as by the admixture of a
pharmaceutically acceptable carrier. Examples of such carriers and methods of
formulation may be found in Remington's Pharmaceutical Sciences. To form a
pharmaceutically acceptable composition suitable for effective administration, such
compositions will contain an effective amount of the protein, DNA, RNA, modified
rhAR, or either rhAR agonists or antagonists.

Therapeutic or diagnostic compositions comprising modulators of
rhAR are administered to an individual in amounts sufficient to treat or diagnose
disorders. The effective amount may vary according to a variety of factors such as
the individual's condition, weight, sex and age. Other factors include the mode of
administration.

The pharmaceutical compositions may be provided to the individual 
by a variety of routes such as subcutaneous, topical, oral and intramuscular. 

The term "chemical derivative" describes a molecule that contains
additional chemical moieties that are not normally a part of the base molecule. Such
moieties may improve the solubility, half-life, absorption, etc. of the base molecule.
Alternatively the moieties may attenuate undesirable side effects of the base molecule
or decrease the toxicity of the base molecule. Examples of such moieties are
described in a variety of texts, such as Remington's Pharmaceutical Sciences.

Compounds identified according to the methods disclosed herein may
be used alone at appropriate dosages. Alternatively, co-administration or sequential
administration of other agents may be desirable.

The present invention also has the objective of providing suitable
topical, oral, systemic and parenteral pharmaceutical formulations for use in the novel
methods of treatment of the present invention. The compositions containing
compounds identified according to this invention as the active ingredient can be
administered in a wide variety of therapeutic dosage forms in conventional vehicles
for administration. For example, the compounds can be administered in such oral
dosage forms as tablets, capsules (each including timed release and sustained release
formulations), pills, powders, granules, elixirs, tinctures, solutions, suspensions,
syrups and emulsions, or by injection. Likewise, they may also be administered in
intravenous (both bolus and infusion), intraperitoneal, subcutaneous, topical with or
without occlusion, or intramuscular form, all using forms well known to those of
ordinary skill in the pharmaceutical arts.

Advantageously, compounds of the present invention may be
administered in a single daily dose, or the total daily dosage may be administered in
divided doses of two, three or four times daily. Furthermore, compounds for the
present invention can be administered in intranasal form via topical use of suitable
intranasal vehicles, or via transdermal routes, using those forms of transdermal skin
patches well known to those of ordinary skill in that art. To be administered in the
form of a transdermal delivery system, the dosage administration will, of course, be
continuous rather than intermittent throughout the dosage regimen.

For combination treatment with more than one active agent, where the
active agents are in separate dosage formulations, the active agents can be
administered concurrently, or they each can be administered at separately staggered
times.

The dosage regimen utilizing the compounds of the present invention
is selected in accordance with a variety of factors including type, species, age, weight,
sex and medical condition of the patient; the severity of the condition to be treated;
the route of administration; the renal, hepatic and cardiovascular function of the
patient; and the particular compound thereof employed. A physician or veterinarian
of ordinary skill can readily determine and prescribe the effective amount of the drug
required to prevent, counter or arrest the progress of the condition. Optimal precision
in achieving concentrations of drug within the range that yields efficacy without
toxicity requires a regimen based on the kinetics of the drug's availability to target
sites. This involves a consideration of the distribution, equilibrium, and elimination
of a drug.

The following examples are provided to illustrate the present invention
without, however, limiting the same hereto.

EXAMPLE 1: 

Isolation and Characterization of a DNA Molecule Encoding rhAR 

The DNA sequence for Macaca fascicularis monkey AR (GenBank
Acc. # U94179, also disclosed in the attached sequence listing as SEQ ID NO:6) and
an EST for Macaca mulatta AR (GenBank Accession No. AF092930) may be used
for primer design. The nucleotide sequence for the Macaca mulatta AR EST is as
follows:

TCTCAAGAGT TTGGATGGCT CCAAATCACC CCCCAGGAAT TCCTGTGCAT
GAAAGCGCTG CTACTCTTCA GCATTATTCC AGTGGATGGG CTSAAAAATC
AAAAATTCTT TGATGAACTT CGAATGAACT ACATCAAGGA ACTCGATCGT
ATCATTGCAT GCAAAAGAAA AAATCCCACA TCCTGCTCAA GGCGTTTCTA
CCAGCTCACC AAGCTCCTGG ACTCCGTGCA GCCTATTGCG AGAGAGCTGC
ATCAGTTCAC TTTTGACCTG CTAATCAAGT CACACATGGT GAGCGTGGAC
TTTCCGGAAA TGATGGCAGA GATCATCTC
(SEQ ID NO:7).
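To see how the EST relates to the protein sequence, its first reading frame can be translated with the standard genetic code. The sketch below is a minimal, self-contained illustration: the `translate` helper is hypothetical, only the first 60 nucleotides of SEQ ID NO:7 (as given above) are used, and any codon containing a symbol other than A, C, G or T is reported as `X` rather than resolved.

```python
# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna, frame=0):
    """Translate DNA in the given frame; codons with non-ACGT symbols become 'X'."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 60 nucleotides of the Macaca mulatta AR EST (SEQ ID NO:7).
EST_START = "TCTCAAGAGT TTGGATGGCT CCAAATCACC CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG"
print(translate(EST_START))
```

The translated stretch (SQEFGWLQITPQEFLCMKAL) matches the corresponding run of residues in the ligand binding domain portion of the rhAR amino acid sequence, which suggests the EST is read in frame from its first nucleotide.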




Messenger RNA from rhesus monkey prostate was prepared and
cDNA was synthesized by standard methods. The full-length Macaca mulatta AR
was cloned via standard PCR methodology. Oligonucleotide primers were based on
Macaca fascicularis AR. Template cDNA was synthesized from Macaca mulatta
prostate mRNA. Primer pairs mkARF2 (5'-ATG GAG GTG GAG TTA GGG CTG-3';
SEQ ID NO:8) and mkARRS (5'-GGT CTT CTG GGG TGG AAA GTA-3'; SEQ
ID NO:9) were used to obtain the NH2-terminal portion of the gene via PCR, while
the COOH-terminal portion was obtained using mkARFS (5'-ACG GOT ACA CTC
GGC GAG CTC-3'; SEQ ID NO:10) and mkARR2 (5'-AAC AGG GAG AAG ACA
TCT GAA-3'; SEQ ID NO:11). Each fragment was sub-cloned into a pCRII vector
and sequencing verification was performed on DNA from each sub-clone. Clones
containing wild type cDNA sequences as compared to the consensus sequence from
both NH2- and COOH-terminal DNA sequence assembly were used for full-length
cDNA construction. The final full-length cDNA was obtained through ligating the 5'
and the 3' end of the cDNA at a KpnI site and cloning into a pCRII vector. The
nucleotide sequence was again verified via sequencing. Also, the starting Met and
5'-UTR information for Macaca mulatta AR was obtained through cDNA extension on
a subdivided Macaca mulatta cDNA library using the mkARR? primer (5'-GGC GGC
CGA GGG TAG ACC CTC-3'; SEQ ID NO:12). The cloned Macaca mulatta AR
cDNA shows seven nucleotide differences from Macaca fascicularis AR in the
coding region which result in two amino acid residue differences. Both open reading
frames show identical polyQ and polyG sequences which are shorter than the human
version, with the DBD and LBD regions being identical to the human version.

EXAMPLE 2

Generation of GST-rhAR Fusion Proteins for Use in In Vitro Screening Assays

Expression vector construction: A PCR fragment containing residues 601 to
895, which contains the whole LBD, was inserted into the pESP-1 expression vector
(#251600, Stratagene, La Jolla, CA) at the SmaI site, which places the rhARLBD
downstream of the GST-Flag tag. The final conjunction sequences are vector 5'-GGA TCC
CCC ACT CTG GGA GCC CTG OCT GTT GGG TAA-3' vector.

AR Expression - GST-Flag-rhARLBD (Mr = 60 kDa) is expressed in yeast
using the pESP-1 vector according to Stratagene's protocol and lysed in TEGM/DTT/PI
buffer [10 mM Tris, pH 7.4, 1 mM EDTA, 10% glycerol, 10 mM molybdate, 2 mM
DTT, 50 ul of yeast protease inhibitor cocktail (PI: Sigma) per gram of yeast and 1/10
vol. of PI complete (PI: Boehringer-Mannheim) per gram of yeast].
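The stated Mr of about 60 kDa can be sanity-checked with a back-of-the-envelope estimate. The sketch below rests on assumptions that are not in the specification: an average residue mass of roughly 110 Da, a GST tag of roughly 26 kDa, and an eight-residue Flag epitope; linker residues contributed by the vector are ignored.

```python
AVG_RESIDUE_MASS_DA = 110.0   # rule-of-thumb average mass per amino acid residue (assumed)
GST_MASS_KDA = 26.0           # approximate mass of the GST tag (assumed)
FLAG_RESIDUES = 8             # length of the Flag epitope (assumed)


def approx_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# The LBD insert spans residues 601 to 895 inclusive, as stated in the text.
lbd_residues = 895 - 601 + 1
total_kda = GST_MASS_KDA + approx_mass_kda(FLAG_RESIDUES) + approx_mass_kda(lbd_residues)
print(f"LBD insert: {lbd_residues} residues; estimated fusion mass ~{total_kda:.0f} kDa")
```

The estimate lands close to the 60 kDa figure given for GST-Flag-rhARLBD, which is consistent with a fusion of the tag and the full 295-residue LBD.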

Fusion Protein Purification - The above fusion protein is purified using
anti-Flag M2 affinity gel (Sigma) via a batch purification method using TEGM/DTT buffer.
The protein is eluted using TEGM/DTT buffer containing 100 ug/ml of Flag peptide.

Hydroxyapatite Binding Assay - Typically, 0.25 ug/ml of recombinant
purified GST-Flag-rhARLBD and 2 nM 3H-R1881 are combined in a 100 ul binding
reaction (with 50 mM Tris, pH 7.5, 10% glycerol, 0.8 M NaCl, 1 mg/ml BSA and 2
mM dithiothreitol) that is incubated for 18 hours at 4°C. 3H-R1881 binding
displacement is assessed in parallel binding reaction aliquots in the presence of
varying concentrations of unlabeled test compounds. Following the initial 18 hour
binding reaction, 100 ul of a 50% (wt/vol) hydroxyapatite (HAP) slurry is added to
each sample, vortexed, and incubated on ice for 10 min. The samples are then
centrifuged and the supernatant aspirated to remove unbound ligand. The HAP pellet
is washed three times with wash buffer (40 mM Tris, pH 7.5, 100 mM KCl, 1 mM
EDTA and 1 mM EGTA). The 3x washed HAP pellet containing ligand-bound
GST-rhAR is transferred in 95% EtOH to a scintillation vial containing 5 ml scintillation
fluid, mixed and counted to quantify the amount of radioligand (3H-R1881) bound to
the recombinant rhAR fusion protein. Results are compared to known high affinity
ligands such as 5-alpha dihydrotestosterone and unlabeled R1881, which exhibit
IC50s of ca. 1 nM.
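One simple way to turn displacement counts from such an assay into an IC50 is to express each reading as a percentage of the no-competitor control and interpolate, on a log concentration scale, where binding crosses 50%. The following is only a sketch: the function names and the data are hypothetical (the data are generated from an idealized one-site model with an IC50 of 1 nM), and a full curve fit would normally be preferred for real data.

```python
import math


def percent_of_control(bound_cpm, control_cpm):
    """Radioligand bound, as a percentage of binding with no competitor."""
    return 100.0 * bound_cpm / control_cpm


def estimate_ic50(concs_nM, percent_bound):
    """Interpolate, on a log10 concentration scale, where binding crosses 50%."""
    points = sorted(zip(concs_nM, percent_bound))
    for (c_lo, p_lo), (c_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% displacement")


# Hypothetical displacement data from a one-site model with IC50 = 1 nM.
concs = [0.01, 0.1, 1.0, 10.0, 100.0]          # unlabeled competitor, nM
control = 10000.0                               # cpm bound with no competitor
cpm = [control / (1.0 + c / 1.0) for c in concs]
percent = [percent_of_control(x, control) for x in cpm]
print(f"estimated IC50 = {estimate_ic50(concs, percent):.2f} nM")
```

Log-scale interpolation is used because competitor concentrations are usually spaced in decades; nonspecific binding is not subtracted here and would need to be handled separately.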

While the foregoing specification teaches the principles of the present
invention, with examples provided for the purpose of illustration, it will be
understood that the practice of the invention encompasses all of the usual variations,
adaptations, or modifications, as come within the scope of the following claims and
their equivalents.

WHAT IS CLAIMED:

1. A purified DNA molecule encoding a Macaca mulatta AR
protein wherein said protein comprises the amino acid sequence as follows:
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVTQNPGPR HPEAASAAPP
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA
PTSSKDNYLE GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ, as set forth in
three-letter abbreviation in SEQ ID NO:2.

2. A DNA expression vector for expressing a Macaca mulatta AR
protein in a recombinant host cell wherein said expression vector comprises a DNA
molecule of Claim 1.

3. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the DNA expression vector of Claim 2.

4. A process for expressing a Macaca mulatta AR protein in a
recombinant host cell, comprising:

(a) transfecting the expression vector of Claim 2 into a suitable
host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said DNA expression
vector.

5. A purified DNA molecule encoding a Macaca mulatta AR
protein wherein said protein consists of the amino acid sequence as follows:
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVTQNPGPR HPEAASAAPP
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA
PTSSKDNYLE GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ, as set forth in
three-letter abbreviation in SEQ ID NO:2.



6. A DNA expression vector for expressing a Macaca mulatta AR
protein in a recombinant host cell wherein said expression vector comprises a DNA
molecule of Claim 5.

7. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the expression vector of Claim 6.

8. A process for expressing a Macaca mulatta AR protein in a
recombinant host cell, comprising:

(a) transfecting the expression vector of Claim 6 into a suitable
host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

9. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said DNA molecule comprises the nucleotide sequence, as follows:

CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
AGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA 
GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT 
CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT 
GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
CCTTTCAGAT GTCTTCTGCC TGTTA, set forth as SEQ ID NO:1.

10. A DNA molecule of Claim 9 which consists of nucleotide 154 
to about nucleotide 1257 of SEQ ID NO: 1. 

11. An expression vector for expressing a Macaca mulatta AR 
protein wherein said expression vector comprises a DNA molecule of Claim 9. 



12. An expression vector for expressing a Macaca mulatta AR 
protein wherein said expression vector comprises a DNA molecule of Claim 10.

13. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the expression vector of Claim 11. 

14. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the expression vector of Claim 12.

15. A process for expressing a Macaca mulatta AR protein in a 
recombinant host cell, comprising: 
(a) transfecting the expression vector of Claim 11 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

16. The process of Claim 15 wherein the host cell is a yeast host
cell.

17. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said DNA molecule consists of the nucleotide sequence, as follows, 
CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
AGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
CCTTTCAGAT GTCTTCTGCC TGTTA, as set forth in SEQ ID NO: 1.

18. A DNA molecule of Claim 17 which consists of nucleotide 423 
to about nucleotide 3108 of SEQ ID NO: 1. 
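
The relationship between the nucleotide coordinates recited above and the encoded protein can be illustrated with a short sketch. It assumes the standard genetic code and 1-based nucleotide numbering; the function name `translate` and the 30-nucleotide fragment (nucleotides 423 to 452 of SEQ ID NO: 1, copied from the sequence printed in Claim 17) are chosen here for illustration only and are not part of the claims.

```python
# Illustrative sketch: translate the first ten codons of the open reading
# frame that begins at nucleotide 423 of SEQ ID NO: 1 (standard genetic code).
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map every codon (TTT, TTC, ..., GGG) to its one-letter amino acid code.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 423-452 of SEQ ID NO: 1 as printed in Claim 17.
orf_start = "ATGGAGGTGCAGTTAGGGCTGGGGAGGGTC"
print(translate(orf_start))  # MEVQLGLGRV

# Length check: as the sequence is printed, the ATG begins at 423 and a TGA
# stop codon occupies 3108-3110, so 423..3107 is the coding portion.
coding_nt = 3107 - 423 + 1
print(coding_nt, coding_nt // 3)  # 2685 895
```

The ten residues obtained match the start of the amino acid sequence recited in the claims, and 895 codons agree with the length of that sequence as printed (17 rows of 50 residues plus a final row of 45).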

19. A DNA expression vector for expressing a Macaca mulatta AR
protein wherein said expression vector comprises a DNA molecule of Claim 17. 

20. A DNA expression vector for expressing a Macaca mulatta
AR protein wherein said expression vector comprises a DNA molecule of Claim 18.

21. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the expression vector of Claim 19. 

22. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the expression vector of Claim 20.

23. A process for expressing a Macaca mulatta AR protein in a 
recombinant host cell, comprising: 
(a) transfecting the expression vector of Claim 19 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

24. The process of Claim 23 wherein the host cell is a yeast host
cell.

25. A purified Macaca mulatta AR protein which comprises the 
amino acid sequence as set forth in SEQ ID NO: 2. 

26. A purified Macaca mulatta AR protein which consists of the 
amino acid sequence as set forth in SEQ ID NO: 2. 

27. A purified Macaca mulatta AR protein derived from a host cell 
transfected with a DNA expression vector which comprises the nucleotide sequence
as set forth in SEQ ID NO:1.

28. A purified Macaca mulatta AR protein of Claim 27 wherein
said DNA expression vector contains from about nucleotide 423 to about nucleotide
3108 of SEQ ID NO:1.

29. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said protein comprises the amino acid sequence as follows: 
MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP 
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS 
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL 
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA 
PTSSKDNYLG GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC 
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT 
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF 
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA 
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA 
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL 
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM 
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL 
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR 
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ 
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH 
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ, as set forth in
three-letter abbreviation in SEQ ID NO:4.

30. A DNA expression vector for expressing a Macaca mulatta AR 
protein in a recombinant host cell wherein said expression vector comprises a DNA
molecule of Claim 29. 

31. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the DNA expression vector of Claim 30. 

32. A process for expressing a Macaca mulatta AR protein in a
recombinant host cell, comprising:

(a) transfecting the expression vector of Claim 30 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said DNA expression
vector.

33. The process according to Claim 32 wherein the host cell is a
yeast host cell.

34. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said protein consists of the amino acid sequence as follows: 

MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP
GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS
QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL
GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA
PTSSKDNYLG GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC
MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT
KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF
PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA
GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA
GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG
PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL
TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM
TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL
NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG
FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR
MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ
KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH
QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ, as set forth in
three-letter abbreviation in SEQ ID NO:4.

35. A DNA expression vector for expressing a Macaca mulatta AR
protein in a recombinant host cell wherein said expression vector comprises a DNA 
molecule of Claim 34. 

36. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the expression vector of Claim 35.

37. A process for expressing a Macaca mulatta AR protein in a 
recombinant host cell, comprising: 
(a) transfecting the expression vector of Claim 35 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

38. The process according to Claim 37 wherein the host cell is a
yeast host cell.

39. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said DNA molecule comprises the nucleotide sequence, as follows: 

CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
GGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
CCTTTCAGAT GTCTTCTGCC TGTTA, set forth as SEQ ID NO:3.

40. A DNA molecule of Claim 39 which consists of nucleotide 154 
to about nucleotide 1257 of SEQ ID NO: 3. 



41. An expression vector for expressing a Macaca mulatta AR
protein wherein said expression vector comprises a DNA molecule of Claim 39.

42. An expression vector for expressing a Macaca mulatta AR 
protein wherein said expression vector comprises a DNA molecule of Claim 40. 

43. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the expression vector of Claim 41. 



44. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the expression vector of Claim 42.

45. A process for expressing a Macaca mulatta AR protein in a 
recombinant host cell, comprising: 
(a) transfecting the expression vector of Claim 41 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

46. The process according to Claim 45 wherein the host cell is a
yeast host cell.

47. A purified DNA molecule encoding a Macaca mulatta AR 
protein wherein said DNA molecule consists of the nucleotide sequence, as follows,

CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
GGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
CCTTTCAGAT GTCTTCTGCC TGTTA, as set forth in SEQ ID NO: 3.



48. A DNA molecule of Claim 47 which consists of nucleotide 423 
to about nucleotide 3108 of SEQ ID NO: 3. 
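
As printed in the claims, the nucleotide sequences of SEQ ID NO: 1 and SEQ ID NO: 3 differ at nucleotide 1051 (A versus G). The sketch below locates the codon affected by that difference. It assumes the standard genetic code and the reading frame that begins at nucleotide 423; the helper name `codon_differences` and the 20-nucleotide fragments (nucleotides 1041 to 1060 of each sequence) are illustrative choices, not part of the claims.

```python
# Illustrative sketch: find the codon affected by a single-nucleotide
# difference between two aligned fragments (standard genetic code assumed).
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ORF_START = 423        # 1-based position of the A of the ATG start codon
FRAGMENT_START = 1041  # 1-based position of the first nucleotide below

seq1_fragment = "AATTACTTAGAGGGCACTTC"  # SEQ ID NO: 1, nucleotides 1041-1060
seq3_fragment = "AATTACTTAGGGGGCACTTC"  # SEQ ID NO: 3, nucleotides 1041-1060


def codon_differences(a, b, fragment_start, orf_start):
    """Return (residue number, codon in a, codon in b, aa in a, aa in b)
    for every in-frame codon that differs between fragments a and b."""
    offset = (fragment_start - orf_start) % 3
    first = (3 - offset) % 3  # index of the first complete in-frame codon
    found = []
    for i in range(first, min(len(a), len(b)) - 2, 3):
        codon_a, codon_b = a[i:i + 3], b[i:i + 3]
        if codon_a != codon_b:
            residue = (fragment_start + i - orf_start) // 3 + 1
            found.append((residue, codon_a, codon_b,
                          CODON_TABLE[codon_a], CODON_TABLE[codon_b]))
    return found


print(codon_differences(seq1_fragment, seq3_fragment, FRAGMENT_START, ORF_START))
# [(210, 'GAG', 'GGG', 'E', 'G')]
```

The result is consistent with the amino acid sequences recited in the claims, where residue 210 ends the block PTSSKDNYLE in SEQ ID NO: 2 and PTSSKDNYLG in SEQ ID NO: 4.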

49. A DNA expression vector for expressing a Macaca mulatta AR
protein wherein said expression vector comprises a DNA molecule of Claim 47. 

50. A DNA expression vector for expressing a Macaca mulatta 
AR protein wherein said expression vector comprises a DNA molecule of Claim 48.

51. A host cell which expresses a recombinant Macaca mulatta AR 
protein wherein said host cell contains the expression vector of Claim 44. 

52. A host cell which expresses a recombinant Macaca mulatta AR
protein wherein said host cell contains the expression vector of Claim 45.

53. A process for expressing a Macaca mulatta AR protein in a 
recombinant host cell, comprising: 
(a) transfecting the expression vector of Claim 49 into a
suitable host cell; and

(b) culturing the host cells of step (a) under conditions which
allow expression of said Macaca mulatta AR protein from said expression vector.

54. The process according to Claim 53 wherein the host cell is a
yeast host cell.

55. A purified Macaca mulatta AR protein which comprises the 
amino acid sequence as set forth in SEQ ID NO: 4. 

56. A purified Macaca mulatta AR protein which consists of the
amino acid sequence as set forth in SEQ ID NO: 4. 

57. A purified Macaca mulatta AR protein derived from a host cell
transfected with a DNA expression vector which comprises the nucleotide sequence
as set forth in SEQ ID NO:3. 

58. A purified Macaca mulatta AR protein of Claim 57 wherein 
said DNA expression vector contains from about nucleotide 423 to about nucleotide
3108 of SEQ ID NO:3.

   1 CCCAAAAAAT AAAAACAAAC AAAAACAAAA CAAAACAAAA AAAACGAATA
  51 AAGAAAAAGG TAATAACTCA GTTCTTATTT GCACCTACTT CCAGTGGACA
 101 CTGAATTTGG AAGGTGGAGG ATTCTTGTTT TTTCTTTTAA GATCGGGCAT
 151 CTTTTGAATC TACCCCTCAA GTGTTAAGAG ACAGACTGTG AGCCTAGCAG
 201 GGCAGATCTT GTCCACCGTG TGTCTTCTTT TGCAGGAGAC TTTGAGGCTG
 251 TCAGAGCGCT TTTTGCGTGG TTGCTCCCGC AAGTTTCCTT CTCTGGAGCT
 301 TCCCGCAGGT GGGCAGCTAG CTGCAGCGAC TACCGCATCA TCACAGCCTG
 351 TTGAACTCTT CTGAGCAAGA GAAGGGGAGG CGGGGTAAGG GAAGTAGGTG
 401 GAAGATTCAG CCAAGCTCAA GGATGGAGGT GCAGTTAGGG CTGGGGAGGG
 451 TCTACCCTCG GCCGCCGTCC AAGACCTACC GAGGAGCTTT CCAGAATCTG
 501 TTCCAGAGCG TGCGCGAAGT GATCCAGAAC CCGGGCCCCA GGCACCCAGA
 551 GGCCGCGAGC GCAGCACCTC CCGGCGCCAG TTTGCAGCAG CAGCAGCAGC
 601 AGCAGCAAGA AACTAGCCCC CGGCAACAGC AGCAGCAGCA GCAGGGTGAG
 651 GATGGTTCTC CCCAAGCCCA TCGTAGAGGC CCCACAGGCT ACCTGGTCCT
 701 GGATGAGGAA CAGCAGCCTT CACAGCCTCA GTCAGCCCCG GAGTGCCACC
 751 CCGAGAGAGG TTGCGTCCCA GAGCCTGGAG CCGCCGTGGC CGCCGGCAAG
 801 GGGCTGCCGC AGCAGCTGCC AGCACCTCCG GACGAGGATG ACTCAGCTGC
 851 CCCATCCACG TTGTCTCTGC TGGGCCCCAC TTTCCCCGGC TTAAGCAGCT
 901 GCTCCGCCGA CCTTAAAGAC ATCCTGAGCG AGGCCAGCAC CATGCAACTC
 951 CTTCAGCAAC AGCAGCAGGA AGCAGTATCC GAAGGCAGCA GCAGCGGGAG
1001 AGCGAGGGAG GCCTCGGGGG CTCCCACTTC CTCCAAGGAC AATTACTTAG
1051 AGGGCACTTC GACCATTTCT GACAGCGCCA AGGAGCTGTG TAAGGCAGTG
1101 TCGGTGTCCA TGGGCTTGGG TGTGGAGGCG TTGGAGCATC TGAGTCCAGG
1151 GGAACAGCTT CGGGGGGATT GCATGTACGC CCCAGTTTTG GGAGTTCCAC
1201 CCGCTGTGCG TCCCACTCCG TGTGCCCCAT TGGCCGAATG CAAAGGTTCT
1251 CTGCTAGACG ACAGCGCAGG CAAGAGCACT GAAGATACTG CTGAGTATTC
1301 CCCTTTCAAG GGAGGTTACA CCAAAGGGCT AGAAGGCGAG AGCCTAGGCT
1351 GCTCTGGCAG CGCTGCAGCA GGGAGCTCCG GGACACTTGA ACTGCCGTCC
1401 ACCCTGTCTC TCTACAAGTC CGGAGCACTG GACGAGGCAG CTGCGTACCA
1451 GAGTCGCGAC TACTACAACT TTCCACTGGC TCTGGCCGGG CCGCCGCCCC
1501 CTCCACCGCC TCCCCATCCC CACGCTCGCA TCAAGCTGGA GAACCCGCTG
1551 GACTATGGCA GCGCCTGGGC GGCTGCGGCG GCGCAGTGCC GCTATGGGGA
1601 CCTGGCGAGC CTGCATGGCG CGGGTGCAGC GGGACCCGGC TCTGGGTCAC
1651 CCTCAGCGGC CGCTTCCTCA TCCTGGCACA CTCTCTTCAC AGCCGAAGAA
1701 GGCCAGTTGT ATGGACCGTG TGGTGGTGGG GGCGGCGGCG GTGGCGGCGG
1751 CGGCGGCGGC GCAGGCGAGG CGGGAGCTGT AGCCCCCTAC GGCTACACTC
1801 GGCCACCTCA GGGGCTGGCG GGCCAGGAAG GCGACTTCAC CGCACCTGAT
1851 GTGTGGTACC CTGGCGGCAT GGTGAGCAGA GTGCCCTATC CCAGTCCCAC
1901 TTGTGTCAAA AGCGAGATGG GCCCCTGGAT GGATAGCTAC TCCGGACCTT
1951 ACGGGGACAT GCGTTTGGAG ACTGCCAGGG ACCATGTTTT GCCAATTGAC
2001 TATTACTTTC CACCCCAGAA GACCTGCCTG ATCTGTGGAG ATGAAGCTTC
2051 TGGGTGTCAC TATGGAGCTC TCACATGTGG AAGCTGCAAG GTCTTCTTCA
2101 AAAGAGCCGC TGAAGGGAAA CAGAAGTACC TGTGTGCCAG CAGAAATGAT
2151 TGCACTATTG ATAAATTCCG AAGGAAAAAT TGTCCATCTT GCCGTCTTCG
2201 GAAATGTTAT GAAGCAGGGA TGACTCTGGG AGCCCGGAAG CTGAAGAAAC
2251 TTGGTAATCT GAAACTACAG GAGGAAGGAG AGGCTTCCAG CACCACCAGC
2301 CCCACTGAGG AGACAGCCCA GAAGCTGACA GTGTCACACA TTGAAGGCTA
2351 TGAATGTCAG CCCATCTTTC TGAATGTCCT GGAGGCCATT GAGCCAGGTG
2401 TGGTGTGTGC TGGACATGAC AACAACCAGC CCGACTCCTT CGCAGCCTTG
2451 CTCTCTAGCC TCAATGAACT GGGAGAGAGA CAGCTTGTAC ATGTGGTCAA
2501 GTGGGCCAAG GCCTTGCCTG GCTTCCGCAA CTTACACGTG GACGACCAGA
2551 TGGCTGTCAT TCAGTACTCC TGGATGGGGC TCATGGTGTT TGCCATGGGC
2601 TGGCGATCCT TCACCAATGT CAACTCCAGG ATGCTCTACT TTGCCCCTGA
2651 TCTGGTTTTC AATGAGTACC GCATGCACAA ATCCCGGATG TACAGCCAGT
2701 GTGTCCGAAT GAGGCACCTC TCTCAAGAGT TTGGATGGCT CCAAATCACC
2751 CCCCAGGAAT TCCTGTGCAT GAAAGCGCTG CTACTCTTCA GCATTATTCC
2801 AGTGGATGGG CTGAAAAATC AAAAATTCTT TGATGAACTT CGAATGAACT
2851 ACATCAAGGA ACTCGATCGT ATCATTGCAT GCAAAAGAAA AAATCCCACA
2901 TCCTGCTCAA GGCGTTTCTA CCAGCTCACC AAGCTCCTGG ACTCCGTGCA
2951 GCCTATTGCG AGAGAGCTGC ATCAGTTCAC TTTTGACCTG CTAATCAAGT
3001 CACACATGGT GAGCGTGGAC TTTCCGGAAA TGATGGCAGA GATCATCTCT
3051 GTGCAAGTGC CCAAGATCCT TTCTGGGAAA GTCAAGCCCA TCTATTTCCA
3101 CACCCAGTGA AGCATTGGAA ATCCCTATTT CCTCACCCCA GCTCATGCCC
3151 CCTTTCAGAT GTCTTCTGCC TGTTA (SEQ ID NO:1)
1 MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP

51 GASLQQQQQQ QQETSPRQQQ QQQQGEDGSP QAHRRGPTGY LVLDEEQQPS

101 QPQSAPECHP ERGCVPEPGA AVAAGKGLPQ QLPAPPDEDD SAAPSTLSLL

151 GPTFPGLSSC SADLKDILSE ASTMQLLQQQ QQEAVSEGSS SGRAREASGA

201 PTSSKDNYLE GTSTISDSAK ELCKAVSVSM GLGVEALEHL SPGEQLRGDC

251 MYAPVLGVPP AVRPTPCAPL AECKGSLLDD SAGKSTEDTA EYSPFKGGYT

301 KGLEGESLGC SGSAAAGSSG TLELPSTLSL YKSGALDEAA AYQSRDYYNF

351 PLALAGPPPP PPPPHPHARI KLENPLDYGS AWAAAAAQCR YGDLASLHGA

401 GAAGPGSGSP SAAASSSWHT LFTAEEGQLY GPCGGGGGGG GGGGGGAGEA

451 GAVAPYGYTR PPQGLAGQEG DFTAPDVWYP GGMVSRVPYP SPTCVKSEMG

501 PWMDSYSGPY GDMRLETARD HVLPIDYYFP PQKTCLICGD EASGCHYGAL

551 TCGSCKVFFK RAAEGKQKYL CASRNDCTID KFRRKNCPSC RLRKCYEAGM

601 TLGARKLKKL GNLKLQEEGE ASSTTSPTEE TAQKLTVSHI EGYECQPIFL

651 NVLEAIEPGV VCAGHDNNQP DSFAALLSSL NELGERQLVH VVKWAKALPG

701 FRNLHVDDQM AVIQYSWMGL MVFAMGWRSF TNVNSRMLYF APDLVFNEYR

751 MHKSRMYSQC VRMRHLSQEF GWLQITPQEF LCMKALLLFS IIPVDGLKNQ

801 KFFDELRMNY IKELDRIIAC KRKNPTSCSR RFYQLTKLLD SVQPIARELH

851 QFTFDLLIKS HMVSVDFPEM MAEIISVQVP KILSGKVKPI YFHTQ (SEQ ID NO:2)
CCCAAAAAATAAAAACAAACAAAAACAAAACAAAACAAAAAAAACGAATAAAGAAAAAGG

1 + --+-- + + + + 60

GGGTTTTTTATTTTTGTTTGTTTTTGTTTTGTTTTGTTTTTTTTGCTTATTTCTTTTTCC

TAATAACTCAGTTCTTATTTGCACCTACTTCCAGTGGACACTGAATTTGGAAGGTGGAGG

61 -- + + + + + + 120

ATTATTGAGTCAAGAATAAACGTGGATGAAGGTCACCTGTGACTTAAACCTTCCACCTCC

ATTCTTGTTTTTTCTTTTAAGATCGGGCATCTTTTGAATCTACCCCTCAAGTGTTAAGAG

121 +---- + + + + + 180

TAAGAACAAAAAAGAAAATTCTAGCCCGTAGAAAACTTAGATGGGGAGTTCACAATTCTC

ACAGACTGTGAGCCTAGCAGGGCAGATCTTGTCCACCGTGTGTCTTCTTTTGCAGGAGAC

181 + + + + + + 240 

TGTCTGACACTCGGATCGTCCCGTCTAGAACAGGTGGCACACAGAAGAAAACGTCCTCTG 

TTTGAGGCTGTCAGAGCGCTTTTTGCGTGGTTGCTCCCGCAAGTTTCCTTCTCTGGAGCT
241 +--- + + + + + 300 

AAACTCCGACAGTCTCGCGAAAAACGCACCAACGAGGGCGTTCAAAGGAAGAGACCTCGA 

TCCCGCAGGTGGGCAGCTAGCTGCAGCGACTACCGCATCATCACAGCCTGTTGAACTCTT 

301 + + + + -+ + 360 

AGGGCGTCCACCCGTCGATCGACGTCGCTGATGGCGTAGTAGTGTCGGACAACTTGAGAA

CTGAGCAAGAGAAGGGGAGGCGGGGTAAGGGAAGTAGGTGGAAGATTCAGCCAAGCTCAA 
361 + + -..+-- + + + 420 

GACTCGTTCTCTTCCCCTCCGCCCCATTCCCTTCATCCACCTTCTAAGTCGGTTCGAGTT

GGATGGAGGTGCAGTTAGGGCTGGGGAGGGTCTACCCTCGGCCGCCGTCCAAGACCTACC 

421 -- + + + + +--- + 480 

CCTACCTCCACGTCAATCCCGACCCCTCCCAGATGGGAGCCGGCGGCAGGTTCTGGATGG
MEVQLGLGRVYPRPPSKTYR 

GAGGAGCTTTCCAGAATCTGTTCCAGAGCGTGCGCGAAGTGATCCAGAACCCGGGCCCCA

481 + +-- -+ + + + 540

CTCCTCGAAAGGTCTTAGACAAGGTCTCGCACGCGCTTCACTAGGTCTTGGGCCCGGGGT
GAFQNLFQSVREVIQNPGPR 

GGCACCCAGAGGCCGCGAGCGCAGCACCTCCCGGCGCCAGTTTGCAGCAGCAGCAGCAGC 
541 + + + + +- ---+ 600 

CCGTGGGTCTCCGGCGCTCGCGTCGTGGAGGGCCGCGGTCAAACGTCGTCGTCGTCGTCG 
HPEAASAAPPGASLQQQQQQ 

AGCAGCAAGAAACTAGCCCCCGGCAACAGCAGCAGCAGCAGCAGGGTGAGGATGGTTCTC

601 + + + + + + 660

TCGTCGTTCTTTGATCGGGGGCCGTTGTCGTCGTCGTCGTCGTCCCACTCCTACCAAGAG 
QQETSPRQQQQQQQGEDGSP 

CCCAAGCCCATCGTAGAGGCCCCACAGGCTACCTGGTCCTGGATGAGGAACAGCAGCCTT 
661 + + + + + + 720 
GGGTTCGGGTAGCATCTCCGGGGTGTCCGATGGACCAGGACCTACTCCTTGTCGTCGGAA
QAHRRGPTGYLVLDEEQQPS 

CACAGCCTCAGTCAGCCCCGGAGTGCCACCCCGAGAGAGGTTGCGTCCCAGAGCCTGGAG 

721 +- + + --+ + + 780 

GTGTCGGAGTCAGTCGGGGCCTCACGGTGGGGCTCTCTCCAACGCAGGGTCTCGGACCTC
QPQSAPECHPERGCVPEPGA 

CCGCCGTGGCCGCCGGCAAGGGGCTGCCGCAGCAGCTGCCAGCACCTCCGGACGAGGATG

781 --- -+ + + + +--- + 840

GGCGGCACCGGCGGCCGTTCCCCGACGGCGTCGTCGACGGTCGTGGAGGCCTGCTCCTAC
AVAAGKGLPQQLPAPPDEDD 

ACTCAGCTGCCCCATCCACGTTGTCTCTGCTGGGCCCCACTTTCCCCGGCTTAAGCAGCT

841 + + + + + + 900

TGAGTCGACGGGGTAGGTGCAACAGAGACGACCCGGGGTGAAAGGGGCCGAATTCGTCGA
SAAPSTLSLLGPTFPGLSSC 

GCTCCGCCGACCTTAAAGACATCCTGAGCGAGGCCAGCACCATGCAACTCCTTCAGCAAC 

901 -+ ---+ +-- --+ + + 960 

CGAGGCGGCTGGAATTTCTGTAGGACTCGCTCCGGTCGTGGTACGTTGAGGAAGTCGTTG 
SADLKDILSEASTMQLLQQQ 

AGCAGCAGGAAGCAGTATCCGAAGGCAGCAGCAGCGGGAGAGCGAGGGAGGCCTCGGGGG 

961 + + +-- -+ + + 1020

TCGTCGTCCTTCGTCATAGGCTTCCGTCGTCGTCGCCCTCTCGCTCCCTCCGGAGCCCCC
QQEAVSEGSSSGRAREASGA

CTCCCACTTCCTCCAAGGACAATTACTTAGAGGGCACTTCGACCATTTCTGACAGCGCCA

1021 + + + ---+ +--- + 1080

GAGGGTGAAGGAGGTTCCTGTTAATGAATCTCCCGTGAAGCTGGTAAAGACTGTCGCGGT
PTSSKDNYLEGTSTISDSAK 

AGGAGCTGTGTAAGGCAGTGTCGGTGTCCATGGGCTTGGGTGTGGAGGCGTTGGAGCATC

1081 + +--- + + + + 1140 

TCCTCGACACATTCCGTCACAGCCACAGGTACCCGAACCCACACCTCCGCAACCTCGTAG 
ELCKAVSVSMGLGVEALEHL 

TGAGTCCAGGGGAACAGCTTCGGGGGGATTGCATGTACGCCCCAGTTTTGGGAGTTCCAC

1141 + + + + +--- -+ 1200

ACTCAGGTCCCCTTGTCGAAGCCCCCCTAACGTACATGCGGGGTCAAAACCCTCAAGGTG
SPGEQLRGDCMYAPVLGVPP 

CCGCTGTGCGTCCCACTCCGTGTGCCCCATTGGCCGAATGCAAAGGTTCTCTGCTAGACG

1201 + -+ + + +- + 1260

GGCGACACGCAGGGTGAGGCACACGGGGTAACCGGCTTACGTTTCCAAGAGACGATCTGC
AVRPTPCAPLAECKGSLLDD 

ACAGCGCAGGCAAGAGCACTGAAGATACTGCTGAGTATTCCCCTTTCAAGGGAGGTTACA 
1261 + + + + + + 1320 
TGTCGCGTCCGTTCTCGTGACTTCTATGACGACTCATAAGGGGAAAGTTCCCTCCAATGT
SAGKSTEDTAEYSPFKGGYT 

CCAAAGGGCTAGAAGGCGAGAGCCTAGGCTGCTCTGGCAGCGCTGCAGCAGGGAGCTCCG

1321 ----+ --+-- + +-- --+ + 1380 

GGTTTCCCGATCTTCCGCTCTCGGATCCGACGAGACCGTCGCGACGTCGTCCCTCGAGGC 
KGLEGESLGCSGSAAAGSSG 

GGACACTTGAACTGCCGTCCACCCTGTCTCTCTACAAGTCCGGAGCACTGGACGAGGCAG 

1381 -- + +- --+ + + ---+ 1440 

CCTGTGAACTTGACGGCAGGTGGGACAGAGAGATGTTCAGGCCTCGTGACCTGCTCCGTC 
TLELPSTLSLYKSGALDEAA 

CTGCGTACCAGAGTCGCGACTACTACAACTTTCCACTGGCTCTGGCCGGGCCGCCGCCCC 
1441 +_ ...+ +.-- + --+ + 1500 

GACGCATGGTCTCAGCGCTGATGATGTTGAAAGGTGACCGAGACCGGCCCGGCGGCGGGG 
AYQSRDYYNFPLALAGPPPP 

CTCCACCGCCTCCCCATCCCCACGCTCGCATCAAGCTGGAGAACCCGCTGGACTATGGCA 

1501 -- --+ + + +-- --+ + 1560 

GAGGTGGCGGAGGGGTAGGGGTGCGAGCGTAGTTCGACCTCTTGGGCGACCTGATACCGT
PPPPHPHARIKLENPLDYGS 

GCGCCTGGGCGGCTGCGGCGGCGCAGTGCCGCTATGGGGACCTGGCGAGCCTGCATGGCG 

1561 ---+ + +--- + + + 1620 

CGCGGACCCGCCGACGCCGCCGCGTCACGGCGATACCCCTGGACCGCTCGGACGTACCGC 
AWAAAAAQCRYGDLASLHGA 

CGGGTGCAGCGGGACCCGGCTCTGGGTCACCCTCAGCGGCCGCTTCCTCATCCTGGCACA 

1621 -+ ---+ + + + + 1680 

GCCCACGTCGCCCTGGGCCGAGACCCAGTGGGAGTCGCCGGCGAAGGAGTAGGACCGTGT 
GAAGPGSGSPSAAASSSWHT 

CTCTCTTCACAGCCGAAGAAGGCCAGTTGTATGGACCGTGTGGTGGTGGGGGCGGCGGCG 

1681 + +-- + + + ---+ 1740 

GAGAGAAGTGTCGGCTTCTTCCGGTCAACATACCTGGCACACCACCACCCCCGCCGCCGC 
LFTAEEGQLYGPCGGGGGGG

GTGGCGGCGGCGGCGGCGGCGCAGGCGAGGCGGGAGCTGTAGCCCCCTACGGCTACACTC 

1741 + +- + + --+- ---+ 1800

CACCGCCGCCGCCGCCGCCGCGTCCGCTCCGCCCTCGACATCGGGGGATGCCGATGTGAG
GGGGGGAGEAGAVAPYGYTR 

GGCCACCTCAGGGGCTGGCGGGCCAGGAAGGCGACTTCACCGCACCTGATGTGTGGTACC

1801 ---+ +- + + + + 1860

CCGGTGGAGTCCCCGACCGCCCGGTCCTTCCGCTGAAGTGGCGTGGACTACACACCATGG 
PPQGLAGQEGDFTAPDVWYP 

CTGGCGGCATGGTGAGCAGAGTGCCCTATCCCAGTCCCACTTGTGTCAAAAGCGAGATGG
1861 - ---+ + + +- ---+ + 1920
GACCGCCGTACCACTCGTCTCACGGGATAGGGTCAGGGTGAACACAGTTTTCGCTCTACC
GGMVSRVPYPSPTCVKSEMG 

GCCCCTGGATGGATAGCTACTCCGGACCTTACGGGGACATGCGTTTGGAGACTGCCAGGG 

1921 --+ + + + + + 1980 

CGGGGACCTACCTATCGATGAGGCCTGGAATGCCCCTGTACGCAAACCTCTGACGGTCCC
PWMDSYSGPYGDMRLETARD 

ACCATGTTTTGCCAATTGACTATTACTTTCCACCCCAGAAGACCTGCCTGATCTGTGGAG

1981 --+- + + + -+ + 2040 

TGGTACAAAACGGTTAACTGATAATGAAAGGTGGGGTCTTCTGGACGGACTAGACACCTC 
HVLPIDYYFPPQKTCLICGD

ATGAAGCTTCTGGGTGTCACTATGGAGCTCTCACATGTGGAAGCTGCAAGGTCTTCTTCA

2041 +- + -+ + +-- --+ 2100

TACTTCGAAGACCCACAGTGATACCTCGAGAGTGTACACCTTCGACGTTCCAGAAGAAGT
EASGCHYGALTCGSCKVFFK 

AAAGAGCCGCTGAAGGGAAACAGAAGTACCTGTGTGCCAGCAGAAATGATTGCACTATTG 

2101 + + --+ + + + 2160 

TTTCTCGGCGACTTCCCTTTGTCTTCATGGACACACGGTCGTCTTTACTAACGTGATAAC
RAAEGKQKYLCASRNDCTID 

ATAAATTCCGAAGGAAAAATTGTCCATCTTGCCGTCTTCGGAAATGTTATGAAGCAGGGA

2161 --+ +--- -+ +--- + + 2220

TATTTAAGGCTTCCTTTTTAACAGGTAGAACGGCAGAAGCCTTTACAATACTTCGTCCCT
KFRRKNCPSCRLRKCYEAGM 

TGACTCTGGGAGCCCGGAAGCTGAAGAAACTTGGTAATCTGAAACTACAGGAGGAAGGAG 

2221 + + + + + + 2280 

ACTGAGACCCTCGGGCCTTCGACTTCTTTGAACCATTAGACTTTGATGTCCTCCTTCCTC
TLGARKLKKLGNLKLQEEGE 

AGGCTTCCAGCACCACCAGCCCCACTGAGGAGACAGCCCAGAAGCTGACAGTGTCACACA

2281 - ---+ +--- + + --+ + 2340 

TCCGAAGGTCGTGGTGGTCGGGGTGACTCCTCTGTCGGGTCTTCGACTGTCACAGTGTGT 
ASSTTSPTEETAQKLTVSHI 

TTGAAGGCTATGAATGTCAGCCCATCTTTCTGAATGTCCTGGAGGCCATTGAGCCAGGTG

2341 + + + + + ---+ 2400 

AACTTCCGATACTTACAGTCGGGTAGAAAGACTTACAGGACCTCCGGTAACTCGGTCCAC 
EGYECQPIFLNVLEAIEPGV 

TGGTGTGTGCTGGACATGACAACAACCAGCCCGACTCCTTCGCAGCCTTGCTCTCTAGCC 

2401 + + + + + + 2460 

ACCACACACGACCTGTACTGTTGTTGGTCGGGCTGAGGAAGCGTCGGAACGAGAGATCGG
VCAGHDNNQPDSFAALLSSL 

TCAATGAACTGGGAGAGAGACAGCTTGTACATGTGGTCAAGTGGGCCAAGGCCTTGCCTG
2461 + ---+ + +- ---+ ---+ 2520
AGTTACTTGACCCTCTCTCTGTCGAACATGTACACCAGTTCACCCGGTTCCGGAACGGAC
NELGERQLVHVVKWAKALPG 

GCTTCCGCAACTTACACGTGGACGACCAGATGGCTGTCATTCAGTACTCCTGGATGGGGC 

2521 +. + +-- -+- --+ -+ 2580 

CGAAGGCGTTGAATGTGCACCTGCTGGTCTACCGACAGTAAGTCATGAGGACCTACCCCG
FRNLHVDDQMAVIQYSWMGL 

TCATGGTGTTTGCCATGGGCTGGCGATCCTTCACCAATGTCAACTCCAGGATGCTCTACT

2581 + + + + + + 2640 

AGTACCACAAACGGTACCCGACCGCTAGGAAGTGGTTACAGTTGAGGTCCTACGAGATGA 
MVFAMGWRSFTNVNSRMLYF 

TTGCCCCTGATCTGGTTTTCAATGAGTACCGCATGCACAAATCCCGGATGTACAGCCAGT

2641 + + +--- -+ --+ + 2700 

AACGGGGACTAGACCAAAAGTTACTCATGGCGTACGTGTTTAGGGCCTACATGTCGGTCA 
APDLVFNEYRMHKSRMYSQC 

GTGTCCGAATGAGGCACCTCTCTCAAGAGTTTGGATGGCTCCAAATCACCCCCCAGGAAT 

2701 -- -+ + + + ---+ + 2760 

CACAGGCTTACTCCGTGGAGAGAGTTCTCAAACCTACCGAGGTTTAGTGGGGGGTCCTTA
VRMRHLSQEFGWLQITPQEF 

TCCTGTGCATGAAAGCGCTGCTACTCTTCAGCATTATTCCAGTGGATGGGCTGAAAAATC 

2761 + + + + + + 2820 

AGGACACGTACTTTCGCGACGATGAGAAGTCGTAATAAGGTCACCTACCCGACTTTTTAG
LCMKALLLFSIIPVDGLKNQ

AAAAATTCTTTGATGAACTTCGAATGAACTACATCAAGGAACTCGATCGTATCATTGCAT

2821 - + + + + + + 2880

TTTTTAAGAAACTACTTGAAGCTTACTTGATGTAGTTCCTTGAGCTAGCATAGTAACGTA
KFFDELRMNYIKELDRIIAC 

GCAAAAGAAAAAATCCCACATCCTGCTCAAGGCGTTTCTACCAGCTCACCAAGCTCCTGG 

2881 + +-- -+ + + -+ 2940

CGTTTTCTTTTTTAGGGTGTAGGACGAGTTCCGCAAAGATGGTCGAGTGGTTCGAGGACC
KRKNPTSCSRRFYQLTKLLD 

ACTCCGTGCAGCCTATTGCGAGAGAGCTGCATCAGTTCACTTTTGACCTGCTAATCAAGT

2941 + + + + + + 3000

TGAGGCACGTCGGATAACGCTCTCTCGACGTAGTCAAGTGAAAACTGGACGATTAGTTCA
SVQPIARELHQFTFDLLIKS 

CACACATGGTGAGCGTGGACTTTCCGGAAATGATGGCAGAGATCATCTCTGTGCAAGTGC 

3001 + + + ---+ -+-- + 3060 

GTGTGTACCACTCGCACCTGAAAGGCCTTTACTACCGTCTCTAGTAGAGACACGTTCACG
HMVSVDFPEMMAEIISVQVP 

CCAAGATCCTTTCTGGGAAAGTCAAGCCCATCTATTTCCACACCCAGTGAAGCATTGGAA
3061 -- -+ + + + + + 3120 



FIG.3E 
GGTTCTAGGAAAGACCCTTTCAGTTCGGGTAGATAAAGGTGTGGGTCACTTCGTAACCTT
KILSGKVKPIYFHTQ 

ATCCCTATTTCCTCACCCCAGCTCATGCCCCCTTTCAGATGTCTTCTGCCTGTTA 

3121 + + + + + 3175 

TAGGGATAAAGGAGTGGGGTCGAGTACGGGGGAAAGTCTACAGAAGACGGACAAT



FIG.3F 
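
The layout of FIG. 3A-3F (upper strand, complementary lower strand, and the one-letter translation beneath) can be checked mechanically. The following is a minimal sketch only, not part of the disclosed methods: it assumes the standard genetic code, and the helper names `complement` and `translate` are hypothetical. It uses nucleotides 421-480 of SEQ ID NO:1, in which the open reading frame begins with the ATG at position 423.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def complement(seq: str) -> str:
    """Return the base-by-base complement (not reversed), as drawn in FIG. 3."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str) -> str:
    """Translate complete codons from the first base, stopping at a stop codon."""
    seq = seq.upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Nucleotides 421-480 of SEQ ID NO:1 (upper strand of the corresponding FIG. 3 row).
top = "GGATGGAGGTGCAGTTAGGGCTGGGGAGGGTCTACCCTCGGCCGCCGTCCAAGACCTACC"
bottom = complement(top)
peptide = translate(top[2:])  # skip "GG" so that reading starts at the ATG (nt 423)

print(bottom)
print(peptide)
```

The final residue of that figure row (Arg) is not produced by this fragment because its codon spans the row boundary: the last base shown here plus the first two bases of nucleotides 481-540.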
SEQUENCE LISTING

<110> Merck & Co., Inc. 

<120> DNA molecules encoding Macaca mulatta
androgen receptor

<130> 20736 PCT 

<150> 60/289,573 
<151> 2001-05-08 

<160> 12 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 3175 
<212> DNA 

<213> Macaca mulatta 



<400> 1 



cccaaaaaat aaaaacaaac aaaaacaaaa caaaacaaaa aaaacgaata aagaaaaagg 60
taataactca gttcttattt gcacctactt ccagtggaca ctgaatttgg aaggtggagg 120
attcttgttt tttcttttaa gatcgggcat cttttgaatc tacccctcaa gtgttaagag 180
acagactgtg agcctagcag ggcagatctt gtccaccgtg tgtcttcttt tgcaggagac 240
tttgaggctg tcagagcgct ttttgcgtgg ttgctcccgc aagtttcctt ctctggagct 300
tcccgcaggt gggcagctag ctgcagcgac taccgcatca tcacagcctg ttgaactctt 360
ctgagcaaga gaaggggagg cggggtaagg gaagtaggtg gaagattcag ccaagctcaa 420
ggatggaggt gcagttaggg ctggggaggg tctaccctcg gccgccgtcc aagacctacc 480
gaggagcttt ccagaatctg ttccagagcg tgcgcgaagt gatccagaac ccgggcccca 540
ggcacccaga ggccgcgagc gcagcacctc ccggcgccag tttgcagcag cagcagcagc 600
agcagcaaga aactagcccc cggcaacagc agcagcagca gcagggtgag gatggttctc 660
cccaagccca tcgtagaggc cccacaggct acctggtcct ggatgaggaa cagcagcctt 720
cacagcctca gtcagccccg gagtgccacc ccgagagagg ttgcgtccca gagcctggag 780
ccgccgtggc cgccggcaag gggctgccgc agcagctgcc agcacctccg gacgaggatg 840
actcagctgc cccatccacg ttgtctctgc tgggccccac tttccccggc ttaagcagct 900
gctccgccga ccttaaagac atcctgagcg aggccagcac catgcaactc cttcagcaac 960
agcagcagga agcagtatcc gaaggcagca gcagcgggag agcgagggag gcctcggggg 1020
ctcccacttc ctccaaggac aattacttag agggcacttc gaccatttct gacagcgcca 1080
aggagctgtg taaggcagtg tcggtgtcca tgggcttggg tgtggaggcg ttggagcatc 1140
tgagtccagg ggaacagctt cggggggatt gcatgtacgc cccagttttg ggagttccac 1200
ccgctgtgcg tcccactccg tgtgccccat tggccgaatg caaaggttct ctgctagacg 1260
acagcgcagg caagagcact gaagatactg ctgagtattc ccctttcaag ggaggttaca 1320
ccaaagggct agaaggcgag agcctaggct gctctggcag cgctgcagca gggagctccg 1380
ggacacttga actgccgtcc accctgtctc tctacaagtc cggagcactg gacgaggcag 1440
ctgcgtacca gagtcgcgac tactacaact ttccactggc tctggccggg ccgccgcccc 1500
ctccaccgcc tccccatccc cacgctcgca tcaagctgga gaacccgctg gactatggca 1560
gcgcctgggc ggctgcggcg gcgcagtgcc gctatgggga cctggcgagc ctgcatggcg 1620
cgggtgcagc gggacccggc tctgggtcac cctcagcggc cgcttcctca tcctggcaca 1680
ctctcttcac agccgaagaa ggccagttgt atggaccgtg tggtggtggg ggcggcggcg 1740
gtggcggcgg cggcggcggc gcaggcgagg cgggagctgt agccccctac ggctacactc 1800
ggccacctca ggggctggcg ggccaggaag gcgacttcac cgcacctgat gtgtggtacc 1860
ctggcggcat ggtgagcaga gtgccctatc ccagtcccac ttgtgtcaaa agcgagatgg 1920
gcccctggat ggatagctac tccggacctt acggggacat gcgtttggag actgccaggg 1980
accatgtttt gccaattgac tattactttc caccccagaa gacctgcctg atctgtggag 2040
atgaagcttc tgggtgtcac tatggagctc tcacatgtgg aagctgcaag gtcttcttca 2100
aaagagccgc tgaagggaaa cagaagtacc tgtgtgccag cagaaatgat tgcactattg 2160
ataaattccg aaggaaaaat tgtccatctt gccgtcttcg gaaatgttat gaagcaggga 2220
tgactctggg agcccggaag ctgaagaaac ttggtaatct gaaactacag gaggaaggag 2280
aggcttccag caccaccagc cccactgagg agacagccca gaagctgaca gtgtcacaca 2340
ttgaaggcta tgaatgtcag cccatctttc tgaatgtcct ggaggccatt gagccaggtg 2400
tggtgtgtgc tggacatgac aacaaccagc ccgactcctt cgcagccttg ctctctagcc 2460
tcaatgaact gggagagaga cagcttgtac atgtggtcaa gtgggccaag gccttgcctg 2520
gcttccgcaa cttacacgtg gacgaccaga tggctgtcat tcagtactcc tggatggggc 2580
tcatggtgtt tgccatgggc tggcgatcct tcaccaatgt caactccagg atgctctact 2640
ttgcccctga tctggttttc aatgagtacc gcatgcacaa atcccggatg tacagccagt 2700
gtgtccgaat gaggcacctc tctcaagagt ttggatggct ccaaatcacc ccccaggaat 2760
tcctgtgcat gaaagcgctg ctactcttca gcattattcc agtggatggg ctgaaaaatc 2820
aaaaattctt tgatgaactt cgaatgaact acatcaagga actcgatcgt atcattgcat 2880
gcaaaagaaa aaatcccaca tcctgctcaa ggcgtttcta ccagctcacc aagctcctgg 2940
actccgtgca gcctattgcg agagagctgc atcagttcac ttttgacctg ctaatcaagt 3000
cacacatggt gagcgtggac tttccggaaa tgatggcaga gatcatctct gtgcaagtgc 3060
ccaagatcct ttctgggaaa gtcaagccca tctatttcca cacccagtga agcattggaa 3120
atccctattt cctcacccca gctcatgccc cctttcagat gtcttctgcc tgtta 3175



<210> 2 
<211> 895 
<212> PRT 

<213> Macaca mulatta 
<400> 2 

Met Glu Val Gln Leu Gly Leu Gly Arg Val Tyr Pro Arg Pro Pro Ser
1 5 10 15

Lys Thr Tyr Arg Gly Ala Phe Gln Asn Leu Phe Gln Ser Val Arg Glu
20 25 30

Val Ile Gln Asn Pro Gly Pro Arg His Pro Glu Ala Ala Ser Ala Ala
35 40 45

Pro Pro Gly Ala Ser Leu Gln Gln Gln Gln Gln Gln Gln Gln Glu Thr
50 55 60

Ser Pro Arg Gln Gln Gln Gln Gln Gln Gln Gly Glu Asp Gly Ser Pro
65 70 75 80

Gln Ala His Arg Arg Gly Pro Thr Gly Tyr Leu Val Leu Asp Glu Glu
85 90 95

Gln Gln Pro Ser Gln Pro Gln Ser Ala Pro Glu Cys His Pro Glu Arg
100 105 110

Gly Cys Val Pro Glu Pro Gly Ala Ala Val Ala Ala Gly Lys Gly Leu
115 120 125

Pro Gln Gln Leu Pro Ala Pro Pro Asp Glu Asp Asp Ser Ala Ala Pro
130 135 140

Ser Thr Leu Ser Leu Leu Gly Pro Thr Phe Pro Gly Leu Ser Ser Cys
145 150 155 160

Ser Ala Asp Leu Lys Asp Ile Leu Ser Glu Ala Ser Thr Met Gln Leu
165 170 175
Leu Gln Gln Gln Gln Gln Glu Ala Val Ser Glu Gly Ser Ser Ser Gly
180 185 190

Arg Ala Arg Glu Ala Ser Gly Ala Pro Thr Ser Ser Lys Asp Asn Tyr
195 200 205

Leu Glu Gly Thr Ser Thr Ile Ser Asp Ser Ala Lys Glu Leu Cys Lys
210 215 220

Ala Val Ser Val Ser Met Gly Leu Gly Val Glu Ala Leu Glu His Leu
225 230 235 240

Ser Pro Gly Glu Gln Leu Arg Gly Asp Cys Met Tyr Ala Pro Val Leu
245 250 255

Gly Val Pro Pro Ala Val Arg Pro Thr Pro Cys Ala Pro Leu Ala Glu
260 265 270

Cys Lys Gly Ser Leu Leu Asp Asp Ser Ala Gly Lys Ser Thr Glu Asp
275 280 285

Thr Ala Glu Tyr Ser Pro Phe Lys Gly Gly Tyr Thr Lys Gly Leu Glu
290 295 300

Gly Glu Ser Leu Gly Cys Ser Gly Ser Ala Ala Ala Gly Ser Ser Gly
305 310 315 320

Thr Leu Glu Leu Pro Ser Thr Leu Ser Leu Tyr Lys Ser Gly Ala Leu
325 330 335

Asp Glu Ala Ala Ala Tyr Gln Ser Arg Asp Tyr Tyr Asn Phe Pro Leu
340 345 350

Ala Leu Ala Gly Pro Pro Pro Pro Pro Pro Pro Pro His Pro His Ala
355 360 365

Arg Ile Lys Leu Glu Asn Pro Leu Asp Tyr Gly Ser Ala Trp Ala Ala
370 375 380

Ala Ala Ala Gln Cys Arg Tyr Gly Asp Leu Ala Ser Leu His Gly Ala
385 390 395 400

Gly Ala Ala Gly Pro Gly Ser Gly Ser Pro Ser Ala Ala Ala Ser Ser
405 410 415

Ser Trp His Thr Leu Phe Thr Ala Glu Glu Gly Gln Leu Tyr Gly Pro
420 425 430

Cys Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly
435 440 445

Glu Ala Gly Ala Val Ala Pro Tyr Gly Tyr Thr Arg Pro Pro Gln Gly
450 455 460

Leu Ala Gly Gln Glu Gly Asp Phe Thr Ala Pro Asp Val Trp Tyr Pro
465 470 475 480

Gly Gly Met Val Ser Arg Val Pro Tyr Pro Ser Pro Thr Cys Val Lys
485 490 495

Ser Glu Met Gly Pro Trp Met Asp Ser Tyr Ser Gly Pro Tyr Gly Asp
500 505 510

Met Arg Leu Glu Thr Ala Arg Asp His Val Leu Pro Ile Asp Tyr Tyr
515 520 525

Phe Pro Pro Gln Lys Thr Cys Leu Ile Cys Gly Asp Glu Ala Ser Gly
530 535 540

Cys His Tyr Gly Ala Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys
545 550 555 560

Arg Ala Ala Glu Gly Lys Gln Lys Tyr Leu Cys Ala Ser Arg Asn Asp
565 570 575

Cys Thr Ile Asp Lys Phe Arg Arg Lys Asn Cys Pro Ser Cys Arg Leu
580 585 590

Arg Lys Cys Tyr Glu Ala Gly Met Thr Leu Gly Ala Arg Lys Leu Lys
595 600 605

Lys Leu Gly Asn Leu Lys Leu Gln Glu Glu Gly Glu Ala Ser Ser Thr
610 615 620

Thr Ser Pro Thr Glu Glu Thr Ala Gln Lys Leu Thr Val Ser His Ile
625 630 635 640

Glu Gly Tyr Glu Cys Gln Pro Ile Phe Leu Asn Val Leu Glu Ala Ile
645 650 655

Glu Pro Gly Val Val Cys Ala Gly His Asp Asn Asn Gln Pro Asp Ser
660 665 670

Phe Ala Ala Leu Leu Ser Ser Leu Asn Glu Leu Gly Glu Arg Gln Leu
675 680 685

Val His Val Val Lys Trp Ala Lys Ala Leu Pro Gly Phe Arg Asn Leu
690 695 700

His Val Asp Asp Gln Met Ala Val Ile Gln Tyr Ser Trp Met Gly Leu
705 710 715 720

Met Val Phe Ala Met Gly Trp Arg Ser Phe Thr Asn Val Asn Ser Arg
725 730 735
Met Leu Tyr Phe Ala Pro Asp Leu Val Phe Asn Glu Tyr Arg Met His
740 745 750

Lys Ser Arg Met Tyr Ser Gln Cys Val Arg Met Arg His Leu Ser Gln
755 760 765

Glu Phe Gly Trp Leu Gln Ile Thr Pro Gln Glu Phe Leu Cys Met Lys
770 775 780

Ala Leu Leu Leu Phe Ser Ile Ile Pro Val Asp Gly Leu Lys Asn Gln
785 790 795 800

Lys Phe Phe Asp Glu Leu Arg Met Asn Tyr Ile Lys Glu Leu Asp Arg
805 810 815

Ile Ile Ala Cys Lys Arg Lys Asn Pro Thr Ser Cys Ser Arg Arg Phe
820 825 830

Tyr Gln Leu Thr Lys Leu Leu Asp Ser Val Gln Pro Ile Ala Arg Glu
835 840 845

Leu His Gln Phe Thr Phe Asp Leu Leu Ile Lys Ser His Met Val Ser
850 855 860

Val Asp Phe Pro Glu Met Met Ala Glu Ile Ile Ser Val Gln Val Pro
865 870 875 880

Lys Ile Leu Ser Gly Lys Val Lys Pro Ile Tyr Phe His Thr Gln
885 890 895
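
The three-letter listing of SEQ ID NO:2 and the one-letter sequence shown in the figures describe the same 895 residues. As an illustrative sketch only (the function name `three_to_one` is hypothetical, and the standard three-letter codes, including Gln and Ile, are assumed), a listing row can be converted to one-letter code for comparison against the figure:

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert whitespace-separated three-letter codes to one-letter code.

    Purely numeric tokens (the position markers printed under each row)
    are skipped; an unrecognised code raises KeyError.
    """
    return "".join(
        THREE_TO_ONE[token] for token in listing.split() if not token.isdigit()
    )


first_row = "Met Glu Val Gln Leu Gly Leu Gly Arg Val Tyr Pro Arg Pro Pro Ser"
last_row = "Lys Ile Leu Ser Gly Lys Val Lys Pro Ile Tyr Phe His Thr Gln"

print(three_to_one(first_row))
print(three_to_one(last_row))
```

Raising on an unrecognised code, rather than skipping it, is a deliberate choice in this sketch: a misread residue code then surfaces as an error instead of silently shortening the sequence.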

<210> 3 
<211> 3175 
<212> DNA 

<213> Macaca mulatta 



<400> 3 



cccaaaaaat aaaaacaaac aaaaacaaaa caaaacaaaa aaaacgaata aagaaaaagg 60
taataactca gttcttattt gcacctactt ccagtggaca ctgaatttgg aaggtggagg 120
attcttgttt tttcttttaa gatcgggcat cttttgaatc tacccctcaa gtgttaagag 180
acagactgtg agcctagcag ggcagatctt gtccaccgtg tgtcttcttt tgcaggagac 240
tttgaggctg tcagagcgct ttttgcgtgg ttgctcccgc aagtttcctt ctctggagct 300
tcccgcaggt gggcagctag ctgcagcgac taccgcatca tcacagcctg ttgaactctt 360
ctgagcaaga gaaggggagg cggggtaagg gaagtaggtg gaagattcag ccaagctcaa 420
ggatggaggt gcagttaggg ctggggaggg tctaccctcg gccgccgtcc aagacctacc 480
gaggagcttt ccagaatctg ttccagagcg tgcgcgaagt gatccagaac ccgggcccca 540
ggcacccaga ggccgcgagc gcagcacctc ccggcgccag tttgcagcag cagcagcagc 600
agcagcaaga aactagcccc cggcaacagc agcagcagca gcagggtgag gatggttctc 660
cccaagccca tcgtagaggc cccacaggct acctggtcct ggatgaggaa cagcagcctt 720
cacagcctca gtcagccccg gagtgccacc ccgagagagg ttgcgtccca gagcctggag 780
ccgccgtggc cgccggcaag gggctgccgc agcagctgcc agcacctccg gacgaggatg 840
actcagctgc cccatccacg ttgtctctgc tgggccccac tttccccggc ttaagcagct 900
gctccgccga ccttaaagac atcctgagcg aggccagcac catgcaactc cttcagcaac 960
agcagcagga agcagtatcc gaaggcagca gcagcgggag agcgagggag gcctcggggg 1020
ctcccacttc ctccaaggac aattacttag ggggcacttc gaccatttct gacagcgcca 1080
aggagctgtg taaggcagtg tcggtgtcca tgggcttggg tgtggaggcg ttggagcatc 1140
tgagtccagg ggaacagctt cggggggatt gcatgtacgc cccagttttg ggagttccac 1200
ccgctgtgcg tcccactccg tgtgccccat tggccgaatg caaaggttct ctgctagacg 1260
acagcgcagg caagagcact gaagatactg ctgagtattc ccctttcaag ggaggttaca 1320
ccaaagggct agaaggcgag agcctaggct gctctggcag cgctgcagca gggagctccg 1380
ggacacttga actgccgtcc accctgtctc tctacaagtc cggagcactg gacgaggcag 1440
ctgcgtacca gagtcgcgac tactacaact ttccactggc tctggccggg ccgccgcccc 1500
ctccaccgcc tccccatccc cacgctcgca tcaagctgga gaacccgctg gactatggca 1560
gcgcctgggc ggctgcggcg gcgcagtgcc gctatgggga cctggcgagc ctgcatggcg 1620
cgggtgcagc gggacccggc tctgggtcac cctcagcggc cgcttcctca tcctggcaca 1680
ctctcttcac agccgaagaa ggccagttgt atggaccgtg tggtggtggg ggcggcggcg 1740
gtggcggcgg cggcggcggc gcaggcgagg cgggagctgt agccccctac ggctacactc 1800
ggccacctca ggggctggcg ggccaggaag gcgacttcac cgcacctgat gtgtggtacc 1860
ctggcggcat ggtgagcaga gtgccctatc ccagtcccac ttgtgtcaaa agcgagatgg 1920
gcccctggat ggatagctac tccggacctt acggggacat gcgtttggag actgccaggg 1980
accatgtttt gccaattgac tattactttc caccccagaa gacctgcctg atctgtggag 2040
atgaagcttc tgggtgtcac tatggagctc tcacatgtgg aagctgcaag gtcttcttca 2100
aaagagccgc tgaagggaaa cagaagtacc tgtgtgccag cagaaatgat tgcactattg 2160
ataaattccg aaggaaaaat tgtccatctt gccgtcttcg gaaatgttat gaagcaggga 2220
tgactctggg agcccggaag ctgaagaaac ttggtaatct gaaactacag gaggaaggag 2280
aggcttccag caccaccagc cccactgagg agacagccca gaagctgaca gtgtcacaca 2340
ttgaaggcta tgaatgtcag cccatctttc tgaatgtcct ggaggccatt gagccaggtg 2400
tggtgtgtgc tggacatgac aacaaccagc ccgactcctt cgcagccttg ctctctagcc 2460
tcaatgaact gggagagaga cagcttgtac atgtggtcaa gtgggccaag gccttgcctg 2520
gcttccgcaa cttacacgtg gacgaccaga tggctgtcat tcagtactcc tggatggggc 2580
tcatggtgtt tgccatgggc tggcgatcct tcaccaatgt caactccagg atgctctact 2640
ttgcccctga tctggttttc aatgagtacc gcatgcacaa atcccggatg tacagccagt 2700
gtgtccgaat gaggcacctc tctcaagagt ttggatggct ccaaatcacc ccccaggaat 2760
tcctgtgcat gaaagcgctg ctactcttca gcattattcc agtggatggg ctgaaaaatc 2820
aaaaattctt tgatgaactt cgaatgaact acatcaagga actcgatcgt atcattgcat 2880
gcaaaagaaa aaatcccaca tcctgctcaa ggcgtttcta ccagctcacc aagctcctgg 2940
actccgtgca gcctattgcg agagagctgc atcagttcac ttttgacctg ctaatcaagt 3000
cacacatggt gagcgtggac tttccggaaa tgatggcaga gatcatctct gtgcaagtgc 3060
ccaagatcct ttctgggaaa gtcaagccca tctatttcca cacccagtga agcattggaa 3120
atccctattt cctcacccca gctcatgccc cctttcagat gtcttctgcc tgtta 3175



<210> 4 
<211> 895 
<212> PRT 

<213> Macaca mulatta 
<400> 4 

Met Glu Val Gln Leu Gly Leu Gly Arg Val Tyr Pro Arg Pro Pro Ser
1 5 10 15

Lys Thr Tyr Arg Gly Ala Phe Gln Asn Leu Phe Gln Ser Val Arg Glu
20 25 30

Val Ile Gln Asn Pro Gly Pro Arg His Pro Glu Ala Ala Ser Ala Ala
35 40 45

Pro Pro Gly Ala Ser Leu Gln Gln Gln Gln Gln Gln Gln Gln Glu Thr
50 55 60

Ser Pro Arg Gln Gln Gln Gln Gln Gln Gln Gly Glu Asp Gly Ser Pro
65 70 75 80

Gln Ala His Arg Arg Gly Pro Thr Gly Tyr Leu Val Leu Asp Glu Glu
85 90 95

Gln Gln Pro Ser Gln Pro Gln Ser Ala Pro Glu Cys His Pro Glu Arg
100 105 110

Gly Cys Val Pro Glu Pro Gly Ala Ala Val Ala Ala Gly Lys Gly Leu
115 120 125

Pro Gln Gln Leu Pro Ala Pro Pro Asp Glu Asp Asp Ser Ala Ala Pro
130 135 140
Ser Thr Leu Ser Leu Leu Gly Pro Thr Phe Pro Gly Leu Ser Ser Cys
145 150 155 160

Ser Ala Asp Leu Lys Asp Ile Leu Ser Glu Ala Ser Thr Met Gln Leu
165 170 175

Leu Gln Gln Gln Gln Gln Glu Ala Val Ser Glu Gly Ser Ser Ser Gly
180 185 190

Arg Ala Arg Glu Ala Ser Gly Ala Pro Thr Ser Ser Lys Asp Asn Tyr
195 200 205

Leu Gly Gly Thr Ser Thr Ile Ser Asp Ser Ala Lys Glu Leu Cys Lys
210 215 220

Ala Val Ser Val Ser Met Gly Leu Gly Val Glu Ala Leu Glu His Leu
225 230 235 240

Ser Pro Gly Glu Gln Leu Arg Gly Asp Cys Met Tyr Ala Pro Val Leu
245 250 255

Gly Val Pro Pro Ala Val Arg Pro Thr Pro Cys Ala Pro Leu Ala Glu
260 265 270

Cys Lys Gly Ser Leu Leu Asp Asp Ser Ala Gly Lys Ser Thr Glu Asp
275 280 285

Thr Ala Glu Tyr Ser Pro Phe Lys Gly Gly Tyr Thr Lys Gly Leu Glu
290 295 300

Gly Glu Ser Leu Gly Cys Ser Gly Ser Ala Ala Ala Gly Ser Ser Gly
305 310 315 320

Thr Leu Glu Leu Pro Ser Thr Leu Ser Leu Tyr Lys Ser Gly Ala Leu
325 330 335

Asp Glu Ala Ala Ala Tyr Gln Ser Arg Asp Tyr Tyr Asn Phe Pro Leu
340 345 350

Ala Leu Ala Gly Pro Pro Pro Pro Pro Pro Pro Pro His Pro His Ala
355 360 365

Arg Ile Lys Leu Glu Asn Pro Leu Asp Tyr Gly Ser Ala Trp Ala Ala
370 375 380

Ala Ala Ala Gln Cys Arg Tyr Gly Asp Leu Ala Ser Leu His Gly Ala
385 390 395 400

Gly Ala Ala Gly Pro Gly Ser Gly Ser Pro Ser Ala Ala Ala Ser Ser
405 410 415

Ser Trp His Thr Leu Phe Thr Ala Glu Glu Gly Gln Leu Tyr Gly Pro
420 425 430

Cys Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly
435 440 445

Glu Ala Gly Ala Val Ala Pro Tyr Gly Tyr Thr Arg Pro Pro Gln Gly
450 455 460

Leu Ala Gly Gln Glu Gly Asp Phe Thr Ala Pro Asp Val Trp Tyr Pro
465 470 475 480

Gly Gly Met Val Ser Arg Val Pro Tyr Pro Ser Pro Thr Cys Val Lys
485 490 495

Ser Glu Met Gly Pro Trp Met Asp Ser Tyr Ser Gly Pro Tyr Gly Asp
500 505 510

Met Arg Leu Glu Thr Ala Arg Asp His Val Leu Pro Ile Asp Tyr Tyr
515 520 525

Phe Pro Pro Gln Lys Thr Cys Leu Ile Cys Gly Asp Glu Ala Ser Gly
530 535 540

Cys His Tyr Gly Ala Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys
545 550 555 560

Arg Ala Ala Glu Gly Lys Gln Lys Tyr Leu Cys Ala Ser Arg Asn Asp
565 570 575

Cys Thr Ile Asp Lys Phe Arg Arg Lys Asn Cys Pro Ser Cys Arg Leu
580 585 590

Arg Lys Cys Tyr Glu Ala Gly Met Thr Leu Gly Ala Arg Lys Leu Lys
595 600 605

Lys Leu Gly Asn Leu Lys Leu Gln Glu Glu Gly Glu Ala Ser Ser Thr
610 615 620

Thr Ser Pro Thr Glu Glu Thr Ala Gln Lys Leu Thr Val Ser His Ile
625 630 635 640

Glu Gly Tyr Glu Cys Gln Pro Ile Phe Leu Asn Val Leu Glu Ala Ile
645 650 655

Glu Pro Gly Val Val Cys Ala Gly His Asp Asn Asn Gln Pro Asp Ser
660 665 670

Phe Ala Ala Leu Leu Ser Ser Leu Asn Glu Leu Gly Glu Arg Gln Leu
675 680 685

Val His Val Val Lys Trp Ala Lys Ala Leu Pro Gly Phe Arg Asn Leu
690 695 700
His Val Asp Asp Gln Met Ala Val Ile Gln Tyr Ser Trp Met Gly Leu
705 710 715 720

Met Val Phe Ala Met Gly Trp Arg Ser Phe Thr Asn Val Asn Ser Arg
725 730 735

Met Leu Tyr Phe Ala Pro Asp Leu Val Phe Asn Glu Tyr Arg Met His
740 745 750

Lys Ser Arg Met Tyr Ser Gln Cys Val Arg Met Arg His Leu Ser Gln
755 760 765

Glu Phe Gly Trp Leu Gln Ile Thr Pro Gln Glu Phe Leu Cys Met Lys
770 775 780

Ala Leu Leu Leu Phe Ser Ile Ile Pro Val Asp Gly Leu Lys Asn Gln
785 790 795 800

Lys Phe Phe Asp Glu Leu Arg Met Asn Tyr Ile Lys Glu Leu Asp Arg
805 810 815

Ile Ile Ala Cys Lys Arg Lys Asn Pro Thr Ser Cys Ser Arg Arg Phe
820 825 830

Tyr Gln Leu Thr Lys Leu Leu Asp Ser Val Gln Pro Ile Ala Arg Glu
835 840 845

Leu His Gln Phe Thr Phe Asp Leu Leu Ile Lys Ser His Met Val Ser
850 855 860

Val Asp Phe Pro Glu Met Met Ala Glu Ile Ile Ser Val Gln Val Pro
865 870 875 880

Lys Ile Leu Ser Gly Lys Val Lys Pro Ile Tyr Phe His Thr Gln
885 890 895


<210> 5
<211> 3175
<212> DNA

<213> Macaca mulatta

<400> 5

gggtttttta tttttgtttg tttttgtttt gttttgtttt 


ttttgcttat ttctttttcc 


60 


attattgagt caagaataaa cgtggatgaa ggtcacctgt 


gacttaaacc ttccacctcc 


120 


taagaacaaa aaagaaaatt ctagcccgta gaaaacttag 


atggggagtt cacaattctc 


180 


tgtctgacac tcggatcgtc ccgtctagaa caggtggcac 


acagaagaaa acgtcctctg 


240 
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aaactccgac 


agtctcgcga 


aaaacgcacc 


aacgagggcg 


ttcaaaggaa 


gagacctcga 


300 


agggcgtcca 


cccgtcgatc 


gacgtcgctg 


atggcgtagt 


agtgtcggac 


aacttgagaa 


360 


gactcgttct 


cttcccctcc 


gccccattcc 


cttcatccac 


cttctaagtc 


ggttcgagtt 


420 


cctacctcca 


cgtcaatccc 


gacccctccc 


agatgggagc 


cggcggcagg 


ttctggatgg 


480 


ctcctcgaaa 


ggtcttagac 


aaggtctcgc 


acgcgcttca 


ctaggtcttg 


ggcccggggt 


540 


ccgtgggtct 


ccggcgctcg 


cgtcgtggag 


ggccgcggtc 


aaacgtcgtc 


gtcgtcgtcg 


600 


tcgtcgttct 


ttgatcgggg 


gccgttgtcg 


tcgtcgtcgt 


cgtcccactc 


ctaccaagag 


660 


gggttcgggt 


agcatctccg 


gggtgtccga 


tggaccagga 


cctactcctt 


gtcgtcggaa 


720 


gtgtcggagt 


cagtcggggc 


ctcacggtgg 


ggctctctcc 


aacgcagggt 


ctcggacctc 


780 


ggcggcaccg 


gcggccgttc 


cccgacggcg 


tcgtcgacgg 


tcgtggaggc 


ctgctcctac 


840 


tgagtcgacg 


gggtaggtgc 


aacagagacg 


acccggggtg 


aaaggggccg 


aattcgtcga 


900 


cgaggcggct 


ggaatttctg 


taggactcgc 


tccggtcgtg 


gtacgttgag 


gaagtcgttg 


960 


tcgtcgtcct 


tcgtcatagg 


cttccgtcgt 


cgtcgccctc 


tcgctccctc 


cggagccccc 


1020 


gagggtgaag 


gaggttcctg 


ttaatgaatc 


tcccgtgaag 


ctggtaaaga 


ctgtcgcggt 


1080 


tcctcgacac 


attccgtcac 


agccacaggt 


acccgaaccc 


acacctccgc 


aacctcgtag 


1140 


actcaggtcc 


ccttgtcgaa 


gcccccctaa 


cgtacatgcg 


gggtcaaaac 


cctcaaggtg 


1200 


ggcgacacgc 


agggtgaggc 


acacggggta 


accggcttac 


gtttccaaga 


gacgatctgc 


1260 


tgtcgcgtcc 


gttctcgtga 


cttctatgac 


gactcataag 


gggaaagttc 


cctccaatgt 


1320 


ggtttcccga 


tcttccgctc 


tcggatccga 


cgagaccgtc 


gcgacgtcgt 


ccctcgaggc 


1380 


cctgtgaact 


tgacggcagg 


tgggacagag 


agatgttcag 


gcctcgtgac 


ctgctccgtc 


1440 


gacgcatggt 


ctcagcgctg 


atgatgttga 


aaggtgaccg 


agaccggccc 


ggcggcgggg 


1500 


gaggtggcgg 


aggggtaggg 


gtgcgagcgt 


agttcgacct 


cttgggcgac 


ctgataccgt 


1560 


cgcggacccg 


ccgacgccgc 


cgcgtcacgg 


cgatacccct 


ggaccgctcg 


gacgtaccgc 


1620 


gcccacgtcg 


ccctgggccg 


agacccagtg 


ggagtcgccg 


gcgaaggagt 


aggaccgtgt 


1680 


gagagaagtg 


tcggcttctt 


ccggtcaaca 


tacctggcac 


accaccaccc 


ccgccgccgc 


1740 


caccgccgcc 


gccgccgccg 


cgtccgctcc 


gccctcgaca 


tcgggggatg 


ccgatgtgag 


1800 


ccggtggagt 


ccccgaccgc 


ccggtccttc 


cgctgaagtg 


gcgtggacta 


cacaccatgg 


1860 


gaccgccgta 


ccactcgtct 


cacgggatag 


ggtcagggtg 


aacacagttt 


tcgctctacc 


1920 


cggggaccta 


cctatcgatg 


aggcctggaa 


tgcccctgta 


cgcaaacctc 


tgacggtccc 


1980 


tggtacaaaa 


cggttaactg 


ataatgaaag 


gtggggtctt 


ctggacggac 


tagacacctc 


2040 


tacttcgaag 


acccacagtg 


atacctcgag 


agtgtacacc 


ttcgacgttc 


cagaagaagt 


2100 


tttctcggcg 


acttcccttt 


gtcttcatgg 


acacacggtc 


gtctttacta 


acgtgataac 


2160 


tatttaaggc 


ttccttttta 


acaggtagaa 


cggcagaagc 


ctttacaata 


cttcgtccct 


2220 


actgagaccc 


tcgggccttc 


gacttctttg 


aaccattaga 


ctttgatgtc 


ctccttcctc 


2280 


tccgaaggtc 


gtggtggtcg 


gggtgactcc 


tctgtcgggt 


cttcgactgt 


cacagtgtgt 


2340 
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aacttccgat 


acttacagtc 


gggtagaaag 


acttacagga cctccggtaa 


ctcggtccac 


2400 


accacacacg 


acctgtactg 


ttgttggtcg ggctgaggaa gcgtcggaac 


gagagatcgg 


2460 


agttacttga 


ccctctctct 


gtcgaacatg 


tacaccagtt cacccggttc 


cggaacggac 


2520 


cgaaggcgtt 


gaatgtgcac 


ctgctggtct accgacagta agtcatgagg 


acctaccccg 


2580 


agtaccacaa 


acggtacccg 


accgctagga 


agtggttaca gttgaggtcc 


tacgagatga 


2640 


aacggggact 


agaccaaaag 


ttactcatgg cgtacgtgtt tagggcctac 


atgtcggtca 


2700 


cacaggctta 


ctccgtggag 


agagttctca 


aacctaccga ggtttagtgg 


ggggtcctta 


2760 


aggacacgta 


ctttcgcgac 


gatgagaagt 


cgtaataagg tcacctaccc 


gactttttag 


2820 


tttttaagaa 


actacttgaa 


gcttacttga 


tgtagttcct tgagctagca 


tagtaacgta 


2880 


cgttttcttt 


tttagggtgt 


aggacgagtt 


ccgcaaagat ggtcgagtgg 


ttcgaggacc 


2940 


tgaggcacgt 


cggataacgc 


tctctcgacg 


tagtcaagtg aaaactggac 


gattagttca 


3000 


gtgtgtacca 


ctcgcacctg 


aaaggccttt 


actaccgtct ctagtagaga 


cacgttcacg 


3060 


ggttctagga 


aagacccttt 


cagttcgggt 


agataaaggt gtgggtcact 


tcgtaacctt 


3120 


tagggataaa 


ggagtggggt 


cgagtacggg 


ggaaagtcta cagaagacgg 


acaat 


3175 



<210> 6 

<211> 2821 

<212> DNA 

<213> Macaca fascicularis 



<400> 6 



atggaggtgc agttagggct ggggagggtc taccctcggc cgccgtccaa gacctaccga 60
ggagctttcc agaatctgtt ccagagcgtg cgcgaagtga tccagaaccc gggccccagg 120
cacccagagg ccgcgagcgc agcacctccc ggcgccagtt tgcagcagca gcagcagcag 180
cagcaagaaa ctagcccccg gcaacagcag cagcagcagc agggtgagga tggttctccc 240
caagcccatc gtagaggccc cacaggctac ctggtcctgg atgaggaaca gcagccttca 300
cagcctcagt cagccccgga gtgccacccc gagagaggtt gcgtcccaga gcctggagcc 360
gccgtggccg ccggcaaggg gctgccgcag cagctgccag cacctccgga cgaggatgac 420
tcagctgccc catccacgtt gtctctgctg ggccccactt tccccggctt aagcagctgc 480
tccaccgacc ttaaagacat cctgagcgag gccagcacca tgcaactcct tcagcaacag 540
cagcaggaag cagtatccga aggcagcagc agcgggagag ccagggaggc ctcgggggct 600
cccacttcct ccaaggacaa ttacttaggg ggcacttcga ccatttctga cagcgccaag 660
gagctgtgta aggcagtgtc ggtgtccatg ggcttgggtg tggaggcgtt ggagcatctg 720
agtccagggg aacagcttcg gggggattgc atgtacgccc cagttttggg agttccaccc 780
gctgtgcgtc ccactccgtg tgccccattg gccgaatgca aaggttctct gctagacgac 840
agcgcaggca agagcactga agatactgct gagtattccc ctttcaaggg aggttacacc 900
aaagggctag aaggcgagag cctaggctgc tctggcagcg ctgcagcagg gagctccggg 960
acacttgaac tgccgtccac cctgtctctc tacaagtccg gagcactgga cgaggcagct 1020
gcgtaccaga gtcgcgacta ctacaacttt ccactggctc tggccgggcc gccgccccct 1080
ccaccgcctc cccatcccca cgctcgcatc aagctggaga acccgctgga ctatggcagc 1140
gcctgggcgg ctgcggcggc gcagtgccgc tatggggacc tggcgagcct gcatggcgcg 1200
ggtgcagcgg gacccggctc tgggtcaccc tcagcggccg cttcctcatc ctggcacact 1260
ctcttcacag ccgaagaagg ccagttgtat ggaccgtgtg gtggtggggg cggcggcggt 1320
ggcggcggcg gcggcggcgc aggcgaggcg ggagctgtag ccccctacgg ctacactcgg 1380
ccacctcagg ggctggcggg ccaggaaggc gacttcaccg cacctgatgt gtggtaccct 1440
ggcggcatgg tgagcagagt gccctatccc agtcccactt gtgtcaaaag cgagatgggc 1500
ccctggatgg atagctactc cggaccttac ggggacatgc ggttggagac tgccagggac 1560
catgttttgc caattgacta ttactttcca ccccagaaga cctgcctgat ctgtggagat 1620
gaagcttctg ggtgtcacta tggagctctc acatgtggaa gctgcaaggt cttcttcaaa 1680
agagccgctg aagggaaaca gaagtacctg tgtgccagca gaaatgattg cactattgat 1740
aaattccgaa ggaaaaattg tccatcttgc cgtcttcgga aatgttatga agcagggatg 1800
actctgggag cccggaagct gaagaaactt ggtaatctga aactacagga ggaaggagag 1860
gcttccagca ccaccagccc cactgaggag acagcccaga agctgacagt gtcacacatt 1920
gaaggctatg aatgtcagcc catctttctg aatgtcctgg aagccattga gccaggtgtg 1980
gtgtgtgctg gacatgacaa caaccagccc gactccttcg cagccttgct ctctagcctc 2040
aatgaactgg gagagagaca gcttgtacat gtggtcaagt gggccaaggc cttgcctggc 2100
ttccgcaact tacacgtgga cgaccagatg gctgtcattc agtactcctg gatggggctc 2160
atggtgtttg ccatgggctg gcgatccttc accaatgtca actccaggat gctctacttt 2220
gcccctgatc tggttttcaa tgagtaccgc atgcacaagt cccggatgta cagccagtgt 2280
gtccgaatga ggcacctctc tcaagagttt ggatggctcc aaatcacccc ccaggaattc 2340
ctgtgcatga aagcgctgct actcttcagc attattccag tggatgggct gaaaaatcaa 2400
aaattctttg atgaacttcg aatgaactac atcaaggaac tcgatcgtat cattgcatgc 2460
aaaagaaaaa atcccacatc ctgctcaagg cgtttctacc agctcaccaa gctcctggac 2520
tccgtgcagc ctattgcgag agagctgcat cagttcactt ttgacctgct aatcaagtca 2580
cacatggtga gcgtggactt tccggaaatg atggcagaga tcatctctgt gcaagtgccc 2640
aaaatccttt ctgggaaagt caagcccatc tatttccaca cccagtgaag cattggaaat 2700
ccctatttcc tcaccccagc tcatgccccc tttcagatgt cttctgcctg ttataactct 2760
gcactactcc tctgcagtgc cttggggaat ttcctctatt gatgtacagt ctgtcatgaa 2820
c 2821

<210> 7
<211> 329 
<212> DNA 

<213> Macaca mulatta 
<400> 7 

tctcaagagt ttggatggct ccaaatcacc ccccaggaat tcctgtgcat gaaagcgctg 60 
ctactcttca gcattattcc agtggatggg ctgaaaaatc aaaaattctt tgatgaactt 120 
cgaatgaact acatcaagga actcgatcgt atcattgcat gcaaaagaaa aaatcccaca 180 
tcctgctcaa ggcgtttcta ccagctcacc aagctcctgg actccgtgca gcctattgcg 240 
agagagctgc atcagttcac ttttgacctg ctaatcaagt cacacatggt gagcgtggac 300 
tttccggaaa tgatggcaga gatcatctc 329 

<210> 8 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer
<400> 8 

atggaggtgc agttagggct g 21 

<210> 9 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer
<400> 9 

ggtcttctgg ggtggaaagt a 21

<210> 10
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR primer
<400> 10

acggctacac tcggccacct c 21

<210> 11
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR primer
<400> 11

aacaggcaga agacatctga a 21

<210> 12
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR primer
<400> 12

ggcggccgag ggtagaccct c 21

PCT/US02/14175